Abstract

It has been observed experimentally that the majority of real-world networks are scale-free and follow
a power law degree distribution. The aim of this paper is to study the algorithmic complexity of such
“typical” networks. The contribution of this work is twofold.

First, we define a deterministic condition for checking whether a graph has a power law degree 
distribution and experimentally validate it on real-world networks. This definition allows us to derive 
interesting properties of power law networks. We observe that for exponents of the degree distribution
in the range [1,2] such networks exhibit the double power law phenomenon that has been observed in several
real-world networks. Our observation indicates that this phenomenon could be explained by purely
graph-theoretic properties.

The second aim of our work is to give a novel theoretical explanation why many algorithms run faster 
on real-world data than what is predicted by algorithmic worst-case analysis. We show how to exploit 
the power law degree distribution to design faster algorithms for a number of classical P-time problems 
including transitive closure, maximum matching, determinant, PageRank and matrix inverse. Moreover, 
we deal with the problems of counting triangles and finding a maximum clique. Previously, it had only
been shown that these problems can be solved very efficiently on power law graphs when these graphs
are random, e.g., drawn at random from some distribution. However, it is unclear how to relate such a
theoretical analysis to real-world graphs, which are fixed. Instead, we show that the randomness
assumption can be replaced with a simple condition on the degrees of adjacent vertices, which can be 
used to obtain similar results. Again, we experimentally validate that many real-world graphs satisfy our 
property. As a result, in some range of power law exponents, we are able to solve the maximum clique 
problem in polynomial time, although in general power law networks the problem is NP-complete. 

In contrast to previous average-case analyses, we believe that this is the first “waterproof”
argument that explains why many real-world networks are easier. Moreover, an interesting aspect of
this study is the existence of structure oblivious algorithms, i.e., algorithms that run faster on power law
networks without explicit knowledge of this fact or explicit knowledge of the parameters of the degree
distribution, e.g., algorithms for maximum clique or triangle counting.

1 Introduction


One of the most interesting observations in our understanding of complex networks is that for most large
networks the degree distribution closely resembles a power law distribution [2], i.e., the number of nodes
of degree $d$ is proportional to $d^{-\alpha}$ for some $\alpha > 1$. Such networks are called scale-free, and many
models explaining their emergence have been proposed, the most important one being the preferential
attachment model [1]. The aim of this work is to study the algorithmic complexity of such “typical” networks
and its contribution is twofold.

First, we define a deterministic condition for checking whether a graph has a power law degree distribution
and show that many real-world networks satisfy it. Graphs satisfying this condition are called power
law bounded networks (PLB).¹ This definition allows us to derive new interesting properties of power law
networks. We observe that for $\alpha \in [1,2]$ PLB graphs with no parallel edges (simple graphs) need to exhibit
the double power law phenomenon. This means that the degree distribution of vertices with sufficiently high
degrees is different and has a higher exponent. This faster decay in the distribution was observed for some
existing simple graphs and was usually attributed to some complex processes [31]. Our results indicate that
this phenomenon may have a basic explanation that uses only pure graph theoretical properties. Essentially,
we show that when $\alpha \in [1,2]$ there are not enough low degree vertices that can be connected to high degree
vertices, and so the number of high degree vertices needs to be lower and cannot be proportional to $d^{-\alpha}$.
This observation implies that for $\alpha \in [1,2]$ simple PLB graphs have only $O(n^{2/\alpha})$ edges. This contrasts with
the expected number of edges in power law multigraphs which is 

The second contribution of this paper is an attempt to reduce the dichotomy in current research in
algorithms, where two rarely interacting directions are pursued. On one hand, theoreticians work on optimizing
the performance of algorithms in the worst-case model. This is an important line of research that has given
us some beautiful algorithms and solutions. There are many success stories: a number of practically efficient
algorithms have been developed only thanks to this rigorous worst-case model, e.g., Dijkstra's shortest paths
algorithm. On the other hand, there are problems where the best solutions that are used in practice have
nothing in common with the state-of-the-art algorithms proposed by theoreticians. This is clearly visible in
the case of the Steiner tree problem, as exemplified by last year’s DIMACS Implementation Challenge. As
shown, e.g., in [?], the algorithm of Byrka et al. [13], which has the best known theoretical approximation
ratio, cannot be used on larger instances because it is too inefficient. Moreover, even on smaller instances
it delivers worse results than the best metaheuristic approach based on local search [50]. The number
of examples where heuristic approaches outperform “worst-case” algorithms is enormous. Intuitively, this is
due to the fact that when one prepares for the worst case, the typical case is handled in a suboptimal
way. Standard ways of overcoming this shortcoming are to work with stochastic models or random graphs,
or to use smoothed analysis. For example, in online stochastic models it is sometimes possible to obtain better
bounds on the expected cost of the algorithm than what is implied by the worst-case competitive ratio [25, 29].
On the other hand, there are cases where smoothed analysis allows us to obtain polynomial running time in
expectation instead of an exponential one [?].

However, the answers given by these stochastic models are still far from being satisfactory. Consider the
rumor spreading process in a social network, e.g., Twitter. It was observed that rumors spread extremely
fast in such networks. The paper [30] tries to give the following explanation for this observation. Social
networks have properties similar to networks obtained from the preferential attachment model [4], so one tries
to argue that the fast spread of rumors in such random networks explains the rapid spread of rumors in
real-world networks. This explanation has the following shortcomings. First, it has been observed that although
many properties of social networks are explained well by this model, there are some properties that are not
captured by it. For example, a better model is to use affiliation networks [38]. Even if social networks were
random we would never know that we have a precise model for them, as we might always miss some important
property. Hence, this argument is far from explaining the observations. Second, there exists just one
instantiation of any social network and there is no way we can see the distribution of all random Twitter
networks that is needed for this argument. Besides, as there is just one example of a social network, it might
be the unlucky one for the stochastic model that lies outside the whp statement. Finally and most importantly,
social networks are not random at all! They represent real-world ties, e.g., friendships, which are far from
being random.

¹ For a formal definition see Definition 3.1.

In this paper we introduce the concept of a PLB network, which gives a novel “waterproof” worst-case
approach that overcomes the aforementioned problems and explains why many real-world networks are easier.
We prove that on PLB networks many problems have lower complexity than what is implied by classical
solutions. The problems that we are able to solve faster include basic P-time problems: transitive closure,
perfect matching, PageRank and counting triangles. Additionally, we show that the NP-hard problem of
finding a maximum clique admits a subexponential time algorithm in PLB networks. An interesting aspect of
this study is the existence of structure oblivious algorithms, i.e., algorithms that run faster on PLB networks
without explicit knowledge of this fact. These structure oblivious algorithms shed some light on why some
existing heuristic approaches are so efficient in practice, e.g., sorting vertices by degrees is the first step in
many heuristic approaches to the maximum clique problem [?].

Explaining why many algorithms work faster on real-world instances than what is predicted by worst-case
analysis is one of the grand questions in algorithmics that has not received a plausible answer so far. A notable
example is the SAT problem [55]. Our paper gives a possible answer to this grand challenge and calls for
further research in this direction. On one hand, we should search for faster solutions to other problems. On
the other hand, we believe that real-world power law networks have more worst-case graph properties that
can be exploited in the design and analysis of algorithms. In particular, we have observed that in a number
of power law graphs with $\alpha > 2$, every vertex of degree $k$ has $o(k)$ neighbors of degree at least $k$ (we say
that the graph has PLB neighborhoods).²

We have experimentally confirmed that this property is present in a number of real-world networks.
This property can effectively replace the randomness assumption about the graph that has been introduced
in previous works and we use it to obtain faster algorithms for counting triangles and for the maximum clique
problem. In particular it implies that for $\alpha > 3$ our maximum clique algorithm works in polynomial time.
This observation clearly contrasts with the proof that the clique problem is NP-hard on power law networks
for any $\alpha > 1$ [24], and implies that it should be possible to efficiently find maximum cliques in numerous
real-world networks in which $\alpha > 3$.

1.1 Our Results and Related Work 

We study the algorithmic complexity of power law networks in a worst-case model. Our work is somewhat
related to the area of average-case analysis of algorithms, which tries to explain why some algorithms are fast
on real-world data. However, we do not use the randomness of the data. Instead, we identify graph properties
that can be exploited to give efficient algorithms. We stress that we are only interested in properties that
can be decided deterministically. We also show that our model is general, by proving that one of the basic
random power law network models generates PLB graphs with high probability.

Counting Triangles The problem of finding or counting triangles in a graph can be solved in $O(n^{\omega})$
time or in [...] time using fast matrix multiplication [?]. There has been some work that tried to show
faster algorithms for counting triangles in power law graphs. Latapy [?] has shown two [...] time
algorithms, where $m$ is the number of edges in the graph. Moreover, Berry et al. [7] have shown that in
random power law graphs, generated by the erased configuration model, triangles can be counted in [...]
time, where $\Delta$ is the maximum vertex degree in the graph. Since the model assumes that $\Delta^2/m < 1/2$, for
$\alpha \in (2, 7/3)$ this gives a [...] time algorithm ($\alpha > 2$ implies $m = O(n)$) and a linear time algorithm
for $\alpha > 7/3$. However, as the authors admit, this algorithm requires the graph to be random and does not
fully apply to real-world graphs. In addition, the assumption that $\Delta^2/m < 1/2$ may be unrealistic, as it is
satisfied in only a few of the real-world networks that we have analyzed (see Table ??).

We show that a very basic and widely used triangle counting algorithm works faster than what has
been demonstrated by Latapy. This simple algorithm processes nodes in increasing order of their degrees,
computes the number of triangles incident to each vertex, and then removes the processed vertex. A simple
analysis shows that this algorithm runs in $O(n^{3/\alpha})$ time for $1 < \alpha < 3$, $O(n \log n)$ time for $\alpha = 3$, and
$O(n)$ time for $\alpha > 3$. Additionally, for graphs with PLB neighborhoods this algorithm runs in [...] time
for $2 < \alpha < 7/3$, and $O(n)$ time for $\alpha > 7/3$. These bounds visibly improve the running time of Latapy’s
algorithm for $\alpha > 2$ and match the results of Berry et al. [7] (up to logarithmic factors) that have been
obtained under the full-randomness assumption. Moreover, when applied to random networks as in [7], our
framework implies stronger whp bounds instead of bounds in expectation. We note that our algorithms are
structure oblivious and do not need to know that the graph is PLB or has PLB neighborhoods to run in the
above bounds. These running times are shown in Fig. ?? and can be slightly improved by using fast matrix
multiplication.

² For a formal statement see Definition 3.9.
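The degree-ordered procedure described above can be sketched in a few lines of Python. This is only an illustrative sketch under our own assumptions: the function name `count_triangles`, the edge-list input format, and the tie-breaking by vertex label are hypothetical choices that the text does not specify.

```python
from collections import defaultdict


def count_triangles(edges):
    """Count triangles in a simple undirected graph given as a list of edges.

    Vertex labels are assumed to be mutually comparable (e.g., integers),
    because they are used to break ties between equal degrees.
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops; parallel edges collapse in the sets
            adj[u].add(v)
            adj[v].add(u)

    # Process vertices in increasing order of their degree in the input graph.
    order = sorted(adj, key=lambda v: (len(adj[v]), v))

    triangles = 0
    for v in order:
        nbrs = list(adj[v])  # neighbors of v that have not been removed yet
        # Every adjacent pair of remaining neighbors closes a triangle at v.
        for i, a in enumerate(nbrs):
            for b in nbrs[i + 1:]:
                if b in adj[a]:
                    triangles += 1
        # Remove v, so that each triangle is counted exactly once.
        for a in nbrs:
            adj[a].discard(v)
        del adj[v]
    return triangles
```

The work done for a vertex is quadratic in the number of its not yet removed neighbors, which is the quantity that the degree bounds discussed above control.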

Maximum Clique The fastest algorithm for finding maximum cliques in general graphs runs in $O(1.2125^n)$
time [?]. Moreover, Chen et al. [?] have shown that maximum clique cannot be solved in subexponential
time unless the exponential time hypothesis fails. We note that the maximum clique problem is NP-hard on
power law graphs [24]. Janson, Luczak and Norros [33] have shown that for $\alpha > 2$ a maximum clique in a power
law graph can be found in polynomial time and approximated for any $\alpha$. However, they assume that the
graph is created using a random Poissonian model. In this paper we show that on PLB graphs the problem
can be solved in subexponential $\exp(O(n^{1/\alpha}))$ time. Additionally, when the graph has PLB neighborhoods
our algorithm runs in $\exp(O(n^{3/2-\alpha/2} \log n))$ time for $2 < \alpha < 3$ and $O(\mathrm{poly}(n))$ time for $\alpha > 3$.

Transitive Closure The transitive closure of a graph $G$ can be either computed in $O(nm)$ time by
executing $n$ graph searches, or in $O(n^{\omega})$ time using block recursion and fast matrix multiplication. We show
that this running time can be improved when $1 < \alpha < 2$ - see Fig. ??.
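For reference, the classical baseline of executing $n$ graph searches can be sketched as follows. This is the textbook approach mentioned above, not the improved algorithm of this paper; the function name `transitive_closure` and the convention that vertices are the integers $0, \ldots, n-1$ are assumptions made for the illustration.

```python
from collections import deque


def transitive_closure(n, edges):
    """Return a list reach, where reach[s] is the set of vertices reachable from s.

    The graph is directed, with vertices 0..n-1 and arcs given as (u, v) pairs.
    Every vertex is considered reachable from itself.
    """
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)

    reach = []
    for s in range(n):  # one breadth-first search per source vertex
        seen = {s}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        reach.append(seen)
    return reach
```

Each search takes $O(n + m)$ time, which gives the $O(nm)$ bound quoted above whenever $m \ge n$.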

Algebraic Matrix Algorithms There are two complexity results for the computation of the determinant
of an $n \times n$ matrix $A$ over a finite field:³ (i) fast matrix multiplication to obtain an $O(n^{\omega})$ time algorithm,⁴
or (ii) Wiedemann’s approach that works in $O(nm)$ time, where $m$ is the number of nonzero entries in the matrix.
We note that there are many heuristic approaches that are used in practice to speed up matrix computations,
e.g., the minimum degree algorithm [26], but these ideas do not improve the worst-case complexities that are
stated above. Here, we are only interested in obtaining a worst-case bound on the arithmetic complexity
of these problems and therefore we will not review this rich body of literature. We note that our approach
is related to the minimum degree algorithm, because as the first step we partition the matrix into a dense and
a sparse part according to the number of nonzero entries in each row or column. However, after this step novel
algorithms are proposed that exploit the structure of the matrix.

We will assume that the non-zero structure of $A$ corresponds to a PLB graph $G$, i.e., $a_{ij} \ne 0$ if and
only if $ij \in E(G)$. We are able to show faster algorithms for the case when $1 < \alpha < 2$. In particular our
algorithm in the case of symmetric matrices works in $O\left(n^{2 + \frac{(\omega-2)(2-\alpha)}{(\omega-2)\alpha + 3 - \omega}}\right)$ time - see Fig. ?? for the running
time in the case of symmetric and general matrices.

Additionally, we show that with the same complexities it is possible to solve linear system with matrix 
A, invert matrix A, and compute PageRank of a graph represented by A. PageRank is a very simple version 
of the eigenproblem and our results could indicate that a general eigenproblem could be solved faster on 
PLB graphs. Developing such faster algorithms for eigenproblem, characteristic polynomial or even matrix 
rank is left as an intriguing open problem. 

Perfect Matching There are several algorithms known for finding a perfect matching in general graphs:
an $O(\sqrt{n}\,m)$ time algorithm [44], an $O(n^{\omega})$ time algorithm [?] and a [...] time algorithm [42]. Here, based on
our results for computing the matrix determinant, we show an algorithm that improves over these results when
$\alpha < 1.09$ - see Fig. ??. We conjecture, however, that an improvement is possible for $\alpha \in [1,2]$.

³ We discuss here only the finite field case as it is the most relevant case for TCS.

⁴ $O(n^{\omega})$ is the time needed for a straight-line program to multiply two $n \times n$ matrices; $\omega$ is called the matrix
multiplication exponent. Currently $\omega < 2.373$ [?].
Organization of the Paper The remainder of this paper is organized as follows. In Section 2 we
introduce basic notation and show some general properties that we later use. In Section 3 we define the class
of PLB graphs and show their basic properties. Then, in Section 4 we verify our definitions on real-world
data. Section 5 analyses very simple algorithms for counting triangles and finding a maximum clique on PLB
graphs. Finally, in Section 6 we present more advanced algebraic algorithms for PLB graphs that compute
the transitive closure, find a perfect matching, and compute the determinant.


2 Preliminaries 


Let $G$ be a graph. Throughout the paper we use $n$ to denote the number of vertices in a graph, $d_k$ to denote
the number of vertices of degree $k$, and $d_{\ge k}$ to denote the number of vertices of degree at least $k$. It should
be clear from the context which graph we refer to. We assume that the graphs we work with are simple,
i.e., they do not contain multiple edges. In the majority of problems that we study (e.g., transitive closure
or maximum clique) multiple edges are not important and can be simply removed. We assume that $\log n$
denotes the binary logarithm function.

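Lemma 2.1. Let $1 \le a < b$, for $a, b \in \mathbb{N}$, and let $c$ be a constant. Then

$$\sum_{i=a}^{b} i^{c} = \begin{cases} O(b^{c+1}) & \text{if } c > -1 \\ O(\log(b/a)) & \text{if } c = -1 \\ O(a^{c+1}) & \text{if } c < -1 \end{cases}$$

Note that throughout the paper we assume that for $b < a$ and any function $f$, $\sum_{i=a}^{b} f(i) = 0$.

Proof. For $x \ge a \ge 1$ we have $\lfloor x \rfloor^{c} = O(x^{c})$. Thus,

$$\sum_{i=a}^{b} i^{c} = \int_{a}^{b+1} \lfloor x \rfloor^{c}\,dx = O(1) \int_{a}^{b+1} x^{c}\,dx.$$

For $c \ne -1$ we have

$$\int_{a}^{b+1} x^{c}\,dx = \frac{1}{c+1}\left((b+1)^{c+1} - a^{c+1}\right).$$

If $c > -1$, then $c + 1 > 0$, so we can bound the expression by $O((b+1)^{c+1}) = O(b^{c+1})$. Otherwise, if
$c < -1$, then $c + 1 < 0$, so we can bound it by $O(a^{c+1})$. It remains to consider the case when $c = -1$:

$$\int_{a}^{b+1} x^{c}\,dx = \int_{a}^{b+1} x^{-1}\,dx = \log(b+1) - \log a = O(\log(b/a)).$$

□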


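We also have a reverse relation:

Lemma 2.2. Let $1 \le a \le b/2$, for $a, b \in \mathbb{N}$, and let $c > 0$ be a constant. Then $\sum_{i=a}^{b} i^{-c-1} = \Omega(a^{-c})$.

Proof.

$$\sum_{i=a}^{b} i^{-c-1} \ge \int_{a}^{b+1} x^{-c-1}\,dx = \frac{1}{c}\left(a^{-c} - (b+1)^{-c}\right) \ge \frac{1}{c}\left(a^{-c} - (2a+1)^{-c}\right) = \Omega(a^{-c}).$$

□

Lemma 2.3. Let $c > 0$, $\alpha > 1$, and $\delta \ge 1$. Then

$$\sum_{i=1}^{\delta} (i+t)^{-\alpha} i^{c} = \begin{cases} O(\delta^{c+1-\alpha}) & \text{if } c > \alpha - 1 \\ O(\log \delta) & \text{if } c = \alpha - 1 \\ O((t+1)^{c+1-\alpha}) & \text{if } c < \alpha - 1 \end{cases}$$

Proof. If $c - \alpha \ge -1$ we simply use the fact that $(i+t)^{-\alpha} \le i^{-\alpha}$ and obtain

$$\sum_{i=1}^{\delta} (i+t)^{-\alpha} i^{c} \le \sum_{i=1}^{\delta} i^{c-\alpha}.$$

By Lemma 2.1 this is equal to $O(\delta^{c+1-\alpha})$ for $c > \alpha - 1$, and equal to $O(\log \delta)$ for $c = \alpha - 1$.

Now consider the case when $c < \alpha - 1$.

$$\begin{aligned}
\sum_{i=1}^{\delta} (i+t)^{-\alpha} i^{c} &\le \sum_{i=1}^{t} (i+t)^{-\alpha} i^{c} + \sum_{i=t+1}^{\delta} (i+t)^{-\alpha} i^{c} \\
&\le O(1) \sum_{i=1}^{t} (2t+1)^{-\alpha} i^{c} + \sum_{i=t+1}^{\delta} i^{c-\alpha} \\
&= O(1)(t+1)^{-\alpha} \sum_{i=1}^{t} i^{c} + \sum_{i=t+1}^{\delta} i^{c-\alpha} \\
&= O((t+1)^{c+1-\alpha}) + O((t+1)^{c+1-\alpha}) \\
&= O((t+1)^{c+1-\alpha}).
\end{aligned}$$

□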


Lemma 2.4. Let $G$ be a graph and $k > 0$. The number of edges of $G$ is at most $\sum_{i=1}^{n-1} d_{\ge i}$.

Proof. Observe that a vertex of degree $k$ is counted $k$ times in the sum. Thus, the sum is equal to the
total degree of all vertices, which is twice the number of edges. □
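The notation $d_k$, $d_{\ge k}$ and the counting argument in the proof of Lemma 2.4 can be checked on a small example. The following sketch is illustrative only; the helper name `degree_counts` and the example graph are our own assumptions.

```python
from collections import Counter


def degree_counts(n, edges):
    """Return (d, d_ge) for a simple undirected graph on vertices 0..n-1.

    d[k] is the number of vertices of degree exactly k, and
    d_ge[k] is the number of vertices of degree at least k.
    """
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    d = Counter(deg)
    d_ge = [0] * (n + 1)  # d_ge[n] = 0 serves as a sentinel
    for k in range(n - 1, -1, -1):
        d_ge[k] = d_ge[k + 1] + d[k]
    return d, d_ge


# A triangle on {0, 1, 2} without the edge (1, 3), plus the edge (0, 3).
edges = [(0, 1), (0, 2), (0, 3), (1, 2)]
d, d_ge = degree_counts(4, edges)

# A vertex of degree k contributes to d_ge[1], ..., d_ge[k], i.e., k times.
total = sum(d_ge[i] for i in range(1, 4))
```

Here `total` equals the sum of all degrees, that is, twice the number of edges, which is the identity used in the proof.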


3 Power law bounded networks 

In this section we introduce our definition of a power law bounded network. There are multiple definitions
of power law networks. Some of them state that in a power law network the number of vertices of degree $k$
is proportional to $k^{-\alpha}$ for some parameter $\alpha$ [1]. In other cases power law is defined with respect to random
graphs and only talks about expected degrees of vertices [?, 12]. Neither of these approaches can be applied to
the analysis of algorithms running on real-world networks. The first one suffers from two serious drawbacks.
First, it is often not stated in a formal way. Second, it seems that it effectively disallows even a single vertex
with high degree. On the other hand, the stochastic definition can only be applied to graphs randomly drawn
from some distribution. This is not the case for real-world graphs, which are fixed.

We introduce the concept of a power law bounded network, which captures the power law behavior of
the degree distribution that is necessary for the analysis of algorithms. At the same time it is weak enough to
cover many real-world graphs. Note that this definition for $t = 0$ is similar to the one in [7]. The main
difference is that we do not impose any lower bounds on the numbers of vertices of given degrees.

Definition 3.1. Let $G$ be an undirected $n$-vertex graph and $c_1 > 0$ be a universal constant. We say that $G$
is power law bounded (PLB) for some parameters $1 < \alpha = O(1)$ and $t \ge 0$ if for every integer $d \ge 0$, the
number of vertices $v$ such that $\deg(v) \in [2^d, 2^{d+1})$ is at most

$$c_1 n (t+1)^{\alpha-1} \sum_{i=2^d}^{2^{d+1}-1} (i+t)^{-\alpha}.$$

In the following we say that $G$ is a PLB graph with parameters $\alpha$ and $t$.

Note that the $(t+1)^{\alpha-1}$ factor in the above definition is necessary to ensure that the sum of the above
upper bounds over all $d$ is $O(n)$. The above power law distribution that includes the shift by the parameter $t$
is called shifted power law [23] and was observed in different real-world networks. In particular, the parameter
$t$ allows us to better fit the degree distributions in our experiments (see Section 4). As our experiments show,
in the networks that we have studied the value of $t$ is very small. However, in general it is unknown whether
and how $t$ depends on other parameters of the network and we are not aware of models that would
describe such a dependence. A reasonable assumption here seems to be that $t = O(n^{\epsilon})$ for every $\epsilon > 0$.
However, when discussing some complexities of our algorithms we will for simplicity sometimes assume that
$t = O(\mathrm{polylog}\, n)$. Hence, the factors in the running time that depend on $t$ are of secondary importance. In
the introduction, when discussing our results, we have assumed that $t = 0$.
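The normalization claim at the beginning of this paragraph can be verified by a short calculation. The following is a sketch that we add for illustration; it compares the sum with an integral and uses only that $\alpha > 1$ is a constant and $t \ge 0$:

$$\sum_{d \ge 0} c_1 n (t+1)^{\alpha-1} \sum_{i=2^d}^{2^{d+1}-1} (i+t)^{-\alpha}
= c_1 n (t+1)^{\alpha-1} \sum_{i \ge 1} (i+t)^{-\alpha}
\le c_1 n (t+1)^{\alpha-1} \left( (t+1)^{-\alpha} + \int_{t+1}^{\infty} x^{-\alpha}\,dx \right)
= c_1 n \left( \frac{1}{t+1} + \frac{1}{\alpha-1} \right) = O(n).$$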

The exact set of graphs that satisfy Definition 3.1 obviously depends on the choice of the constant $c_1$.
However, as we later show, many real-world graphs satisfy this definition for a small value of $c_1$, i.e., at most
5. At the same time, the running time dependency of our algorithms on $c_1$ is only polynomial. The only
exception is an algorithm for finding a maximum clique, whose running time itself is super-polynomial.
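Since Definition 3.1 is a deterministic condition on the degree sequence, it can be checked directly. The sketch below is one straightforward way to do so, assuming that the degree sequence is given as a list; the function names `plb_violations` and `is_plb` and the default parameter values are hypothetical and not taken from the paper.

```python
def plb_violations(degrees, alpha, t=0.0, c1=1.0):
    """List the degree ranges [2^d, 2^(d+1)) that exceed the PLB bound.

    degrees is the degree sequence of the graph; n is its length.
    Returns tuples (d, number of vertices in the range, value of the bound).
    Vertices of degree 0 belong to no range and are ignored.
    """
    n = len(degrees)
    max_deg = max(degrees, default=0)
    violations = []
    d = 0
    while 2 ** d <= max_deg:
        lo, hi = 2 ** d, 2 ** (d + 1)  # the range [lo, hi)
        count = sum(1 for x in degrees if lo <= x < hi)
        bound = c1 * n * (t + 1) ** (alpha - 1) * sum(
            (i + t) ** (-alpha) for i in range(lo, hi)
        )
        if count > bound:
            violations.append((d, count, bound))
        d += 1
    return violations


def is_plb(degrees, alpha, t=0.0, c1=1.0):
    """Check the condition of the PLB definition for the given parameters."""
    return not plb_violations(degrees, alpha, t, c1)
```

For example, a star with 8 leaves has the degree sequence `[8, 1, 1, 1, 1, 1, 1, 1, 1]`. For $\alpha = 2$ and $t = 0$ its single high-degree vertex violates the bound when $c_1 = 1$, while the condition holds when $c_1 = 5$, which illustrates the dependence on $c_1$ discussed above.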

Let us list some basic properties of PLB graphs. 

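Lemma 3.2. Let $G$ be a PLB graph with parameters $\alpha$ and $t$. Then, $d_{\ge k} = O(n(t+1)^{\alpha-1}(k+t)^{1-\alpha}) =
O(n(t+1)^{\alpha-1}k^{1-\alpha})$.

Proof. Observe that $k' = 2^{\lfloor \log k \rfloor}$ is the largest power of 2 which is not greater than $k$, thus, $k' \le k < 2k'$.
We bound the number of vertices whose degree is at least $k'$, which is an upper bound on the number of
vertices of degree at least $k$.

$$\begin{aligned}
c_1 n (t+1)^{\alpha-1} \sum_{i=k'}^{n-1} (i+t)^{-\alpha} &\le c_1 n (t+1)^{\alpha-1} \sum_{i=k'+\lfloor t \rfloor}^{n-1+\lfloor t \rfloor} i^{-\alpha} \\
&= O(n(t+1)^{\alpha-1}(k'+t)^{1-\alpha}) \\
&= O(n(t+1)^{\alpha-1}(k+t)^{1-\alpha}).
\end{aligned}$$

□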

The following lemma is used, e.g., to bound the running times of algorithms, which take f{k) time to 
process a vertex of degree k, where / is at most polynomial in its parameter. Roughly speaking, it says that 
the running time of a polynomial algorithm running on a PLB network is asymptotically the same as the 
running time on a graph with an ideal power law distribution. 

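Lemma 3.3. Let $G$ be a PLB graph with parameters $\alpha$ and $t$. Let $d_i$ be the number of vertices of degree $i$
in $G$. Let $f : \mathbb{N} \to \mathbb{N}$ be a nondecreasing function, such that for any $x, c \in \mathbb{N}$, $f(cx) \le c^{O(1)} f(x)$. Then, for
every $k \ge 1$ we have $\sum_{i=1}^{k} d_i f(i) = O(1)\, n (t+1)^{\alpha-1} \sum_{i=1}^{k} (i+t)^{-\alpha} f(i)$.

Proof. Let us first derive an auxiliary inequality.

$$\begin{aligned}
\sum_{i=2^j}^{2^{j+1}-1} d_i f(i) &\le \sum_{i=2^j}^{2^{j+1}-1} d_i f(2^{j+1}) = f(2^{j+1}) \sum_{i=2^j}^{2^{j+1}-1} d_i \\
&\le O(1)\, n (t+1)^{\alpha-1} \sum_{i=2^j}^{2^{j+1}-1} (i+t)^{-\alpha} f(2^j) \\
&\le O(1)\, n (t+1)^{\alpha-1} \sum_{i=2^j}^{2^{j+1}-1} (i+t)^{-\alpha} f(i).
\end{aligned}$$

Note that we introduce $O(1)$ to hide $c_1$ and the constant that comes from replacing $f(2^{j+1})$ by $f(2^j)$. Let
$k' = 2^{\lceil \log(k+1) \rceil} - 1$. Thus $k \le k' \le 2k$ and $k' = 2^l - 1$ for some integer $l$.

$$\begin{aligned}
\sum_{i=1}^{k} d_i f(i) &\le \sum_{i=1}^{k'} d_i f(i) = \sum_{j=0}^{l-1} \sum_{i=2^j}^{2^{j+1}-1} d_i f(i) \\
&\le \sum_{j=0}^{l-1} O(1)\, n (t+1)^{\alpha-1} \sum_{i=2^j}^{2^{j+1}-1} (i+t)^{-\alpha} f(i) \\
&= O(1)\, n (t+1)^{\alpha-1} \sum_{j=0}^{l-1} \sum_{i=2^j}^{2^{j+1}-1} (i+t)^{-\alpha} f(i) \\
&= O(1)\, n (t+1)^{\alpha-1} \sum_{i=1}^{k'} (i+t)^{-\alpha} f(i) \\
&\le O(1)\, n (t+1)^{\alpha-1} \sum_{i=1}^{2k} (i+t)^{-\alpha} f(i) \\
&= O(1)\, n (t+1)^{\alpha-1} \sum_{i=1}^{k} \left( (2i-1+t)^{-\alpha} f(2i-1) + (2i+t)^{-\alpha} f(2i) \right) \\
&\le O(1)\, n (t+1)^{\alpha-1} \sum_{i=1}^{k} (i+t)^{-\alpha} \left( f(2i-1) + f(2i) \right) \\
&\le O(1)\, n (t+1)^{\alpha-1} \sum_{i=1}^{k} (i+t)^{-\alpha} f(i).
\end{aligned}$$

□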


By using Lemma 3.3 together with Lemma 2.1 we obtain the following bound on the number of edges
touching small degree vertices.

Lemma 3.4. Let $G$ be a PLB graph with parameters $\alpha$ and $t$, where $\alpha < 2$. Then, the number of edges
incident to at least one vertex of degree at most $k$ is $O(n(t+1)^{\alpha-1}k^{2-\alpha})$.

Proof. The number of edges incident to at least one vertex of degree at most $k$ is at most $\sum_{i=1}^{k} d_i i$. We use
Lemma 3.3 with the identity function $f(i) = i$, obtaining

$$\begin{aligned}
\sum_{i=1}^{k} d_i i &= O\left(n(t+1)^{\alpha-1} \sum_{i=1}^{k} (i+t)^{-\alpha} \cdot i\right) \\
&= O\left(n(t+1)^{\alpha-1} \sum_{i=1}^{k} i^{1-\alpha}\right) \\
&= O(n(t+1)^{\alpha-1}k^{2-\alpha}). \qquad (1)
\end{aligned}$$

In the first transformation we use the fact that $(i+t)^{-\alpha} \le i^{-\alpha}$, whereas in the second one we use
Lemma 2.1. □


By combining Lemma 3.2 with Lemma 2.4 we obtain the following.

Lemma 3.5. Let $G$ be a PLB graph with parameters $\alpha$ and $t$. Then, the number of edges of $G$ is (a)
$O(n^{3-\alpha}(t+1)^{\alpha-1})$ for $1 < \alpha < 2$, (b) $O(n \log n\,(t+1))$ for $\alpha = 2$, (c) $O(n(t+1))$ for $\alpha > 2$.

Proof. By Lemma 2.4 the number of edges is at most $\sum_{i=1}^{n-1} d_{\ge i}$. By Lemma 3.2, $d_{\ge i} = O(n(t+1)^{\alpha-1}(i+t)^{-\alpha+1})$.
We have

$$\sum_{i=1}^{n-1} O(n(t+1)^{\alpha-1}(i+t)^{1-\alpha}) = n(t+1)^{\alpha-1} \sum_{i=1}^{n-1} O((i+t)^{1-\alpha}). \qquad (2)$$

We now use Lemma 2.1. For $1 < \alpha < 2$ we have $1 - \alpha > -1$, so the sum can be bounded by $O((n+t)^{2-\alpha}) =
O(n^{2-\alpha})$. Putting it back into Equation (2) we get $O(n^{3-\alpha}(t+1)^{\alpha-1})$. For $\alpha = 2$, we may bound the sum
by $O(\log((n+t)/(t+1))) = O(\log n)$, thus obtaining $O(n \log n\,(t+1))$. Finally, for $\alpha > 2$, we bound the sum by
$O((t+1)^{2-\alpha})$, so the number of edges is $O(n(t+1))$. □

Interestingly, for a PLB graph with $1 < \alpha < 2$, the bound on the number of edges given by
Lemma 3.5 is not tight. In particular, the number of vertices with high degree (considerably greater than
$n^{1/\alpha}$) is polynomially smaller. We call a vertex a high-degree vertex if its degree exceeds a threshold of this
order, which Lemma 3.6 makes precise. Each edge either connects two high-degree vertices or is incident to a
low-degree vertex. The number of edges of the first type is bounded, as there are few high-degree vertices,
whereas the number of edges of the second type is bounded by simply summing the degrees of low-degree
vertices. Note that this reasoning heavily depends on the fact that the graph is simple. This is formalized in
the following lemma.

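Lemma 3.6. Let $G$ be a PLB graph with parameters $\alpha$ and $t$, where $1 < \alpha < 2$, and $k \ge n^{1/\alpha}(t+1)^{1-1/\alpha}$.
Then, $d_{\ge k} = O(n^{3-\alpha}(t+1)^{(\alpha-1)(3-\alpha)}k^{\alpha^2-3\alpha+1})$.

Proof. We say that a vertex of degree at least $k$ is a high-degree vertex. By Lemma 3.2, $d_{\ge k} = O(n(t+1)^{\alpha-1}k^{1-\alpha})$.
We will use the fact that $G$ has no multiple edges to derive a stronger upper bound on $d_{\ge k}$.
We first bound the total degree of high-degree vertices, which we denote by $S$. The edges whose both
endpoints have high degrees contribute at most $d_{\ge k}(d_{\ge k} - 1) \le d_{\ge k}^2$ to $S$. In addition, a low-degree vertex
of degree $i$ contributes at most $\min(i, d_{\ge k})$. Recall that by $d_i$ we denote the number of vertices of degree $i$.
Thus, we may bound $S$ by $d_{\ge k}^2 + \sum_{i=1}^{k-1} d_i \min(i, d_{\ge k})$. We now apply Lemma 3.3 using $f(i) = \min(i, d_{\ge k})$.

$$\begin{aligned}
d_{\ge k}^2 + \sum_{i=1}^{k-1} d_i \min(i, d_{\ge k}) &\le d_{\ge k}^2 + O(1)\, n(t+1)^{\alpha-1} \sum_{i=1}^{k-1} (i+t)^{-\alpha} \min(i, d_{\ge k}) \\
&\le d_{\ge k}^2 + O(1)\, n(t+1)^{\alpha-1} \left( \sum_{i=1}^{d_{\ge k}} i^{1-\alpha} + d_{\ge k} \sum_{i=d_{\ge k}+1}^{k} i^{-\alpha} \right) \\
&= d_{\ge k}^2 + O(1)\, n(t+1)^{\alpha-1} \left( O(d_{\ge k}^{2-\alpha}) + O(d_{\ge k}^{2-\alpha}) \right) \\
&= d_{\ge k}^2 + O(n(t+1)^{\alpha-1} d_{\ge k}^{2-\alpha}).
\end{aligned}$$

Note that when we split the sum into two sums, we use the assumed convention that for $a > b$, $\sum_{i=a}^{b} f(i) = 0$.
We now bound $d_{\ge k}$:

$$d_{\ge k} = O(n(t+1)^{\alpha-1}k^{1-\alpha}) = O(n^{1/\alpha}(t+1)^{(\alpha-1)/\alpha}).$$

This gives

$$d_{\ge k}^2 = d_{\ge k}^{\alpha}\, d_{\ge k}^{2-\alpha} = d_{\ge k}^{2-\alpha}\, O(n(t+1)^{\alpha-1}).$$

Hence, the total degree of high-degree vertices is

$$\begin{aligned}
d_{\ge k}^2 + O(n(t+1)^{\alpha-1} d_{\ge k}^{2-\alpha}) &= O(n(t+1)^{\alpha-1} d_{\ge k}^{2-\alpha}) \\
&= O(n(t+1)^{\alpha-1}\, n^{2-\alpha}(t+1)^{(\alpha-1)(2-\alpha)} k^{(1-\alpha)(2-\alpha)}) \\
&= O(n^{3-\alpha}(t+1)^{(\alpha-1)(3-\alpha)} k^{(1-\alpha)(2-\alpha)}).
\end{aligned}$$

To obtain the bound on the number of high-degree vertices, we divide the obtained bound by $k$, which gives
$O(n^{3-\alpha}(t+1)^{(\alpha-1)(3-\alpha)}k^{\alpha^2-3\alpha+1})$. □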

Corollary 3.7. Let $G$ be a PLB graph with parameters $\alpha$ and $t$, where $1 < \alpha < 2$, and $k \ge n^{1/\alpha}(t+1)^{1-1/\alpha}$.
Then, the number of vertices of degree between $k$ and $2k$ is $O(n^{3-\alpha}(t+1)^{(\alpha-1)(3-\alpha)}k^{\alpha^2-3\alpha+1})$.


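Let us use Lemma 3.6 to derive a stricter bound on the number of edges in a PLB graph with $1 < \alpha < 2$.

Lemma 3.8. Let $G$ be a PLB graph with parameters $\alpha$ and $t$, where $1 < \alpha < 2$. Then, $G$ has
$O(n^{2/\alpha}(t+1)^{2-2/\alpha})$ edges.

Proof. Let $\delta = n^{1/\alpha}(t+1)^{1-1/\alpha}$. By Lemma 2.4 it suffices to bound $\sum_{i=1}^{n-1} d_{\ge i}$, which we split at $\delta$.
Observe that since $1 < \alpha < 2$ we have $\alpha^2 - 3\alpha + 1 < -1$, so by Lemma 3.6 and Lemma 2.1

$$\begin{aligned}
\sum_{i=\delta}^{n-1} d_{\ge i} &= \sum_{i=\delta}^{n-1} O(n^{3-\alpha}(t+1)^{(\alpha-1)(3-\alpha)} i^{\alpha^2-3\alpha+1}) \\
&= O(n^{3-\alpha}(t+1)^{(\alpha-1)(3-\alpha)}\, n^{(1/\alpha)(\alpha^2-3\alpha+2)} (t+1)^{(1-1/\alpha)(\alpha^2-3\alpha+2)}) \\
&= O(n^{3-\alpha+\alpha-3+2/\alpha}(t+1)^{-\alpha^2+4\alpha-3+\alpha^2-3\alpha+2-\alpha+3-2/\alpha}) \\
&= O(n^{2/\alpha}(t+1)^{2-2/\alpha}).
\end{aligned}$$

On the other hand, by Lemma 3.2 and Lemma 2.1

$$\begin{aligned}
\sum_{i=1}^{\delta} d_{\ge i} &\le \sum_{i=1}^{\delta} O(n(t+1)^{\alpha-1} i^{1-\alpha}) \\
&= O(n(t+1)^{\alpha-1}\, n^{(2-\alpha)/\alpha} (t+1)^{(1-1/\alpha)(2-\alpha)}) \\
&= O(n^{2/\alpha}(t+1)^{\alpha-1+2-\alpha-2/\alpha+1}) \\
&= O(n^{2/\alpha}(t+1)^{2-2/\alpha}).
\end{aligned}$$

Thus, $\sum_{i=1}^{n-1} d_{\ge i} = O(n^{2/\alpha}(t+1)^{2-2/\alpha})$. □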

3.1 PLB neighborhoods 

Assume that we pick a random vertex from a power law graph with parameter $\alpha$, proportionally to its
degree. Then, the degree of the chosen vertex comes from a power law distribution with parameter $\alpha - 1$.
This implies that, roughly speaking, for each vertex $v$ in a random power law graph, the distribution
of degrees of neighbors of $v$ also obeys a power law. This fact can be exploited to obtain better running time
bounds for some algorithms. However, the algorithms that we later give actually rely on a weaker property.
Namely, for a vertex of degree $k$ they only need a bound on the number of neighbors of degree at least $k$.
Note that if we randomly pick $k$ vertices proportionally to their degrees, then the number of chosen vertices
of degree at least $k$ is $O((t+1)^{\alpha-2} k \sum_{i=k}^{n-1} i(i+t)^{-\alpha})$. This motivates the following.

Definition 3.9. Let $G$ be a PLB graph with parameters $\alpha > 2$ and $t$, and let $c_2 > 0$ be a universal constant.
We say that $G$ has PLB neighborhoods if for every vertex $v$ of degree $k$, the number of neighbors of $v$ of
degree at least $k$ is at most $c_2 \max\left(\log n,\ (t+1)^{\alpha-2} k \sum_{i=k}^{n-1} i(i+t)^{-\alpha}\right)$.

The $\log n$ factor in the definition comes from the fact that we assume that the graph is created in a
random way. Thus, the actual numbers of neighbors may slightly deviate from the expected values.
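Like the PLB condition, the PLB neighborhoods property of Definition 3.9 can be tested deterministically on a given graph. The following sketch is illustrative only. It assumes the bound $c_2 \max(\log n, (t+1)^{\alpha-2} k \sum_{i=k}^{n-1} i (i+t)^{-\alpha})$ with the binary logarithm, and the function name `has_plb_neighborhoods` and the adjacency-dictionary input format are hypothetical.

```python
import math


def has_plb_neighborhoods(adj, alpha, t=0.0, c2=1.0):
    """Check the PLB neighborhoods bound for every vertex of a simple graph.

    adj maps each vertex to the set of its neighbors (undirected graph).
    """
    n = len(adj)
    deg = {v: len(nbrs) for v, nbrs in adj.items()}

    # tail[k] = sum_{i=k}^{n-1} i * (i + t)^(-alpha), computed as suffix sums.
    tail = [0.0] * (n + 1)
    for i in range(n - 1, 0, -1):
        tail[i] = tail[i + 1] + i * (i + t) ** (-alpha)

    log_n = math.log2(n) if n > 1 else 0.0
    for v, k in deg.items():
        if k == 0:
            continue  # isolated vertices have no neighbors to count
        high = sum(1 for u in adj[v] if deg[u] >= k)
        bound = c2 * max(log_n, (t + 1) ** (alpha - 2) * k * tail[k])
        if high > bound:
            return False
    return True
```

In a star every leaf has a single neighbor, so the bound holds easily, whereas in a small complete graph every vertex sees only neighbors of equal degree and the check fails unless $c_2$ is increased.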

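Lemma 3.10. Let $G$ be a PLB graph with parameters $\alpha > 2$ and $t$, and PLB neighborhoods. Then, for
every vertex $v$ of degree $k$, the number of neighbors of $v$ of degree at least $k$ is $O(\max(\log n, (t+1)^{\alpha-2}k^{3-\alpha}))$.

Proof. We have

$$\begin{aligned}
c_2 (t+1)^{\alpha-2} k \sum_{i=k}^{n-1} i(i+t)^{-\alpha} &\le c_2 (t+1)^{\alpha-2} k \sum_{i=k}^{n-1} i^{1-\alpha} \\
&= O((t+1)^{\alpha-2} k \cdot k^{2-\alpha}) \\
&= O((t+1)^{\alpha-2} k^{3-\alpha}).
\end{aligned}$$

Thus, $c_2 \max\left(\log n,\ (t+1)^{\alpha-2} k \sum_{i=k}^{n-1} i(i+t)^{-\alpha}\right) = O(\max(\log n, (t+1)^{\alpha-2}k^{3-\alpha}))$.

□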


3.2 Relation to other models 


Definitions 3.1 and 3.9 are designed to capture the properties of power law graphs that can be easily exploited
in the analysis of algorithms. At the same time there are many random graph models that produce power
law graphs. In these models even giving simple bounds on the degree distributions of the produced graphs is
often highly nontrivial. The analyses of some of these models [?] only give the expected numbers
of vertices of given degrees and analyze the concentration. A typical concentration statement says that (with
high probability) the number of vertices of degree $k$ differs from the expected value by some small additive
error (e.g., $\sqrt{n \log n}$). This cannot be directly used to show that these graphs satisfy Definition 3.1. Proving
that would require bounding the number of vertices of degree belonging to $[2^d, 2^{d+1})$, but if we simply sum
the approximate numbers of vertices of each degree in $[2^d, 2^{d+1})$, the additive errors accumulate. At the same
time we believe that many of the proposed random graph processes yield PLB graphs, but proving this is a
challenging open problem.

Other models for power law graphs are based on fixing a degree sequence in the beginning. In the
erased configuration model [siiaiii] the degrees of all vertices are fixed in the very beginning to obtain an
almost ideal power law distribution. Then a graph is picked uniformly at random among all graphs that
have the given degree sequence. Note that we fix an “ideal” power law degree sequence, as it is done, e.g.,
in imiiiiz], but in some works on this model the degree of each vertex is picked independently at random.


Theorem 3.11. Let n be sufficiently large and G be a random power law graph with parameter $\alpha > 1$ created
by the erased configuration model. Then G is a PLB graph with parameters $\alpha$ and t = 0. Moreover, with high
probability, G has PLB neighborhoods.


These statements are true for some universal constants $c_1$ and $c_2$ (see Definitions 3.1 and 3.9).

The remaining part of this section gives a proof of Theorem 3.11. Let us now describe the erased
configuration model in detail. First, we pick a degree for every vertex, in such a way that the number of
vertices of degree k is $O(n/k^{\alpha})$. Since the sum of all degrees has to be even, we add one vertex of degree 1 if
necessary. For simplicity of the analysis we ignore this added vertex. Then, we build a random graph with
the chosen degree sequence as follows:


1. Build a complete graph H containing deg(v) copies of each vertex v.

2. Choose a random perfect matching in H and remove the edges that are not in the matching. 

3. Build G from H by merging the copies of each vertex. 


The resulting graph may have multiple edges or self-loops, which we remove. 
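
The three steps above can be sketched as follows. This is an illustrative implementation rather than the authors' code: the function name `erased_configuration_model` and the `seed` parameter are assumptions, and the random perfect matching is obtained by shuffling the list of vertex copies and pairing consecutive entries.

```python
import random


def erased_configuration_model(degrees, seed=None):
    """Sample a simple graph from the erased configuration model.

    degrees[v] is the prescribed degree of vertex v; the sum must be even.
    Returns a dict mapping each vertex to the set of its neighbors.
    """
    if sum(degrees) % 2 != 0:
        raise ValueError("the sum of degrees must be even")
    rng = random.Random(seed)
    # Step 1: deg(v) copies of every vertex v.
    copies = [v for v, d in enumerate(degrees) for _ in range(d)]
    # Step 2: a uniformly random perfect matching on the copies
    # (shuffle, then pair consecutive entries).
    rng.shuffle(copies)
    adj = {v: set() for v in range(len(degrees))}
    # Step 3: merge the copies; sets erase multiple edges and the
    # inequality test erases self-loops.
    for a, b in zip(copies[0::2], copies[1::2]):
        if a != b:
            adj[a].add(b)
            adj[b].add(a)
    return adj
```

Because loops and multiple edges are erased, the realized degree of a vertex can be smaller than its prescribed degree, but never larger.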

It follows easily that the maximum degree in G is $O(n^{1/\alpha})$. We now verify that G satisfies Definition 3.1.
We have

$$\sum_{i=2^d}^{2^{d+1}-1} d_i \le \sum_{i=2^d}^{2^{d+1}-1} O(n/i^{\alpha}) = O(n) \sum_{i=2^d}^{2^{d+1}-1} i^{-\alpha} \le c_1 n \sum_{i=2^d}^{2^{d+1}-1} i^{-\alpha}$$

for some universal constant $c_1$. Thus, G is a PLB graph with parameters t = 0 and $\alpha$.



The proof that, with high probability, G has PLB neighborhoods (satisfies Definition 3.9) is more involved.
Let us now assume that $\alpha > 2$ and fix a vertex v of degree k. Our goal is to bound the number of neighbors
of v of degree at least k.

Vertex v has k copies in H, which we denote by $v_1, \ldots, v_k$. We say that a vertex of H is bad if it is a copy
of a vertex of degree at least k, but not a copy of v. Let us define a sequence of Boolean random variables
$X_1, \ldots, X_k$, where $X_i = 1$ iff $v_i$ is matched in H with a bad vertex. Note that matching $v_i$ with another
copy of v does no harm, as this creates a self-loop in G, which is then removed. Thus, $\sum_{i=1}^{k} X_i$ is an upper
bound on the number of neighbors of v of degree at least k (in G). The number of bad vertices is bounded by

$$\sum_{i=k}^{n-1} i\, d_i \le n \sum_{i=k}^{n-1} O(i^{1-\alpha}).$$

Thus, $P(X_i = 1) \le C k^{2-\alpha}$, for some universal constant C, as $P(X_i = 1)$ is bounded by the probability of a
randomly chosen vertex being bad. Define $X = \sum_{i=1}^{k} X_i$. It follows that $E(X) \le C k^{3-\alpha}$.

We now use the Chernoff bound to bound X. The variables $X_i$ are not independent, but they are negatively
associated, which suffices for the Chernoff bound to work (see, e.g., [21]).

Lemma 3.12. For any set $I \subseteq \{1, \ldots, k\}$, $P(\bigwedge_{i \in I} X_i = 1) \le \prod_{i \in I} P(X_i = 1)$; that is, the variables $X_i$ are
negatively associated.

Proof. Assume the number of bad vertices is b. Then $P(X_i = 1) = b/(n-1)$.

Observe that the perfect matching in H can be computed as follows. We go through the vertices in any
order. For each vertex, if it is already matched, we skip it. Otherwise, we match it to a randomly chosen
unmatched vertex.

For the purpose of the proof, we may assume that the first vertices that are chosen in this process are
$v_1, \ldots, v_k$. In the i-th step, we need to compute the probability that $v_i$ is matched to a bad vertex, provided
that vertices $v_1, \ldots, v_{i-1}$ have been matched to bad vertices. This probability is clearly
$(b - i + 1)/(n - 1 - 2i + 2) \le b/(n-1)$. Thus, $P(\bigwedge_{i \in I} X_i = 1) \le (b/(n-1))^{|I|} = \prod_{i \in I} P(X_i = 1)$. The lemma follows. □


In our proof we use the following version of the Chernoff bound. By Lemma 3.12 it can be applied to the
random variables $X_1, \ldots, X_k$.


Theorem 3.13 ([21]). Let $X = \sum_i X_i$, where $0 \le X_i \le 1$ and the $X_i$ are negatively associated. Let
$t > 2e E(X)$. Then $P(X > t) \le 2^{-t}$.


We now proceed with the main part of the proof of the second claim of Theorem 3.11, which states that
G has PLB neighborhoods (with high probability). We show that the property of Definition 3.9 holds for a
single vertex with high probability and then use the union bound.

Set $t = \max(c \log n, 2e E(X))$. By Theorem 3.13 we have $P(X > t) \le 2^{-t} \le n^{-c}$. Thus, with high
probability a vertex of degree k has at most t neighbors of degree at least k. We now show that
$t \le c_2 \max(\log n, k \sum_{i=k}^{n-1} i^{1-\alpha})$ for some universal constant $c_2$. The case when $t = c \log n$ is trivial, as we may
set $c_2 = c$. Now, assume that $t = 2e E(X)$. We have that $E(X) \le C k^{3-\alpha}$. By Lemma 2.2,
$k^{2-\alpha} \le c' \sum_{i=k}^{n-1} i^{1-\alpha}$ for some universal constant $c'$. The Lemma requires that $k \le (n-1)/2$, which follows
from the fact that the degrees are bounded by $O(n^{1/\alpha})$. Hence,

$$t = 2e E(X) \le 2eC k^{3-\alpha} = O\Bigl(k \sum_{i=k}^{n-1} i^{1-\alpha}\Bigr),$$

which completes the proof of Theorem 3.11.


4 Real-World Networks are Power-Law Bounded 

In this section we verify our definitions from Section 3 on real-world networks. The majority of our graphs
comes from Stanford Large Network Dataset Collection [?D]. In addition, we analyze the global flights 
network m. as well as WIW social network degree distribution [T^{^ 

First, we focus on Definition 3.1. We compute the degree distribution of each network and then try
to choose the parameters $c_1$, $\alpha$ and t, so that our bound on the number of vertices of a given degree is as
tight as possible. At the same time we ensure that $c_1$ is at most 5 (as it is supposed to be a constant)
and try to maximize $\alpha$, since larger $\alpha$ implies better running time bounds of our algorithms. The results
of these adjustments are shown in Table 1. Observe that the value of t is very small compared to n. Some
of the graphs in the data sets are directed. For such graphs we make two adjustments. Either we drop the
orientations of the edges (“(directed, in-degree + out-degree)” in Table 1), or we slightly modify Definition 3.1
and only consider the outdegrees of vertices (“(directed, out-degree)” in Table 1). For some of the networks,
the accompanying figure also shows the degree distribution, as well as the bound of Definition 3.1. In order to show the
data with more detail, we plot not only the numbers of vertices whose degrees belong to $[2^d, 2^{d+1})$ (actual
and the upper bounds of Definition 3.1), but also the numbers of vertices of degree belonging to $[k, 2k)$ for
each $1 \le k < n$.

In the case when $\alpha < 2$, the bound from Corollary 3.7 is marked with a green line in the figure. For the Epinions
and WikiTalk graphs the critical degree, at which the second power law starts, is predicted rather well. Note
that this is not the case for the Facebook graph, as the maximum number of friends one can have is limited to
5000. The high-degree part of the distribution is cut off at this number. While the critical degree is predicted
decently, the slope of the second power law distribution is underestimated. This is most probably due to the
worst-case form of our bounds, which are overly pessimistic with respect to the actual trend.

^We thank the authors of m for sharing this data with us.

Then, we move on to Definition 3.9. For each network we use the previously computed parameters $\alpha$ and
t and find the smallest value of $c_2$ for which the definition is satisfied. We skip the graphs where $\alpha < 2$,
as Definition 3.9 does not apply to them. The values of $c_2$ obtained this way are also shown in Table 1.
Observe that for every network the computed value of $c_2$ is less than 8.06, and for a big majority of them it
is less than 2. This confirms that the property of Definition 3.9 is indeed present in real-world graphs.

Table 1 also contains two adjustments, in which we force the value of t to be 0. In some sense this is
similar to fitting the standard definition of a power law distribution to our data. However, this causes the
value of $\alpha$ to increase and makes our bounds much further from the real data, as shown in the accompanying figure.


5 Counting Triangles and Maximal Clique 

This section presents our first two algorithms for PLB graphs. The first algorithm counts triangles, whereas 
the second one returns the size of the maximal clique. The algorithms themselves are easy and should be 
considered folklore. However, we show that in the case of PLB graphs they perform much better than in the 
case of general graphs. Then we obtain even better running time bounds for graphs with PLB neighborhoods. 
This is the most important contribution of this section, as we believe that it gives the first solid explanation 
of the good performance of triangle counting and maximum clique algorithms in real-world graphs. 

Both our algorithms are based on the same construction. We first direct the edges of G towards vertices
of higher degree. Formally, let $v_1, \ldots, v_n$ be all vertices of G sorted in non-decreasing order of degrees. We
define $\vec{G}$ to be the graph obtained from G by directing each undirected edge $v_i v_j$ towards $v_{\max(i,j)}$. Note that
since the degrees of vertices are bounded by the number of vertices, we may sort the vertices and build $\vec{G}$ in
linear time. Moreover, note that $\vec{G}$ does not contain any cycles. Let b(k) be the maximum out-degree in $\vec{G}$
of a vertex of degree k in G.
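
The orientation and the quantity b(k) can be sketched as follows. This is a minimal illustration under stated assumptions: the graph is a dictionary of neighbor sets, vertex identifiers are comparable, and ties between equal degrees are broken by identifier, which plays the role of the fixed sorted order. The names `orient_by_degree` and `max_outdegree_by_degree` are hypothetical.

```python
def orient_by_degree(adj):
    """Direct every edge towards the endpoint that is later in the order
    by (degree, vertex id). Returns a dict of out-neighbor sets."""
    rank = {v: (len(adj[v]), v) for v in adj}
    return {v: {u for u in adj[v] if rank[u] > rank[v]} for v in adj}


def max_outdegree_by_degree(adj):
    """b(k): the maximum out-degree in the oriented graph among the
    vertices that have degree k in the undirected graph."""
    out = orient_by_degree(adj)
    b = {}
    for v in adj:
        k = len(adj[v])
        b[k] = max(b.get(k, 0), len(out[v]))
    return b
```

Since every edge points from a lower rank to a strictly higher rank, the oriented graph is acyclic by construction.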

Note that the value b(k) is related to graph degeneracy. We say that a graph is d-degenerate if every
subgraph has a vertex of degree at most d. In our case G is d-degenerate for $d = \max_{i=1,\ldots,n-1} b(i)$. In
d-degenerate graphs we can count triangles in O(dm) time |TH]. Since PLB graphs are $O(n^{1/\alpha})$-degenerate
(assuming t = 0), this can be used to obtain a running time bound of $O(mn^{1/\alpha})$, which is the same as the
running time given in m. However, with a slightly more careful analysis, in this section we improve this
bound. While this result is simple, to the best of our knowledge it has not been previously stated explicitly.
We first use the bounds derived in Section 3 to bound b(k).

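Lemma 5.1. Let G be a PLB graph with parameters $\alpha$ and t. Then
$b(k) = O(\min(k, d_{\ge k})) = O(\min(k, n(t+1)^{\alpha-1}k^{1-\alpha})) = O(n^{1/\alpha}(t+1)^{1-1/\alpha})$.

Proof. Obviously $b(k) \le k$, since for every $v \in V(G)$, we have $\mathrm{outdeg}_{\vec{G}}(v) \le \deg_G(v)$. In addition to that,
since the edges are directed towards vertices of higher degree, $b(k) \le d_{\ge k}$. The second equality follows
directly from the bound on $d_{\ge k}$ derived in Lemma 3.2.

It remains to show that $O(\min(k, n(t+1)^{\alpha-1}k^{1-\alpha})) = O(n^{1/\alpha}(t+1)^{1-1/\alpha})$. Assume that
$k \ge n^{1/\alpha}(t+1)^{1-1/\alpha}$ (otherwise the first term of the minimum is already within the bound). Then

$$n(t+1)^{\alpha-1}k^{1-\alpha} \le n(t+1)^{\alpha-1} n^{1/\alpha-1}(t+1)^{2-\alpha-1/\alpha} = n^{1/\alpha}(t+1)^{1-1/\alpha},$$

as desired. The lemma follows. □
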


If our graph additionally has PLB neighborhoods (see Definition 3.9), we may obtain a better bound.


Lemma 5.2. Let G be a PLB graph with parameters $\alpha > 2$ and t, and PLB neighborhoods. Then
$b(k) = O(\min(n(t+1)^{\alpha-1}k^{1-\alpha}, \log n + (t+1)^{\alpha-2}k^{3-\alpha})) = O(\log n + (t+1)^{\alpha/2-1/2}n^{3/2-\alpha/2})$.

Proof. Observe that b(k) is at most the number of neighbors of degree at least k in the neighborhood.
Thus, by Lemma 3.10, $b(k) = O(\max(\log n, (t+1)^{\alpha-2}k^{3-\alpha}))$. By Lemma 5.1 we also have
$b(k) = O(n(t+1)^{\alpha-1}k^{1-\alpha})$. Thus, we get $b(k) = O(\min(n(t+1)^{\alpha-1}k^{1-\alpha}, \log n + (t+1)^{\alpha-2}k^{3-\alpha}))$.

To balance the terms we take $k = \sqrt{(t+1)n}$. Thus,
$O(\min(n(t+1)^{\alpha-1}k^{1-\alpha}, \log n + (t+1)^{\alpha-2}k^{3-\alpha})) = O(\log n + (t+1)^{\alpha-2}((t+1)n)^{3/2-\alpha/2})
= O(\log n + (t+1)^{\alpha/2-1/2}n^{3/2-\alpha/2})$. □


5.1 Counting Triangles 

We now show efficient algorithms for counting triangles in an undirected PLB graph G with parameter $\alpha$.
Their pseudocodes are given as Algorithms 1 and 2. The first algorithm is clearly structure-oblivious. The
second one also does not use the structure of the graph explicitly. However, it takes a parameter $\delta$, which
will depend on the graph parameters $\alpha$ and t. Observe that Algorithm 1 can easily be extended to list triangles
in the same running time bound.


Algorithm 1 Structure-oblivious algorithm for counting triangles
 1: function CountTriangles(G)
 2:     Construct $\vec{G}$
 3:     triangles := 0
 4:     for v ∈ V(G) do
 5:         S := set of endpoints of outedges of v
 6:         for each inedge wv of v in $\vec{G}$ do
 7:             for each outedge wu of w in $\vec{G}$ do
 8:                 if u ∈ S then
 9:                     triangles := triangles + 1
10:     return triangles
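
Algorithm 1 can be sketched in Python as follows. The sketch is illustrative and makes two assumptions that are not in the pseudocode: the graph is a dictionary of neighbor sets with comparable vertex identifiers, and the loops are organized around the lowest-ranked corner w of a triangle instead of around v, which visits the same triples (w, v, u) without storing in-edge lists.

```python
def count_triangles(adj):
    """Count the triangles of a simple undirected graph (dict of sets)."""
    # Orient every edge towards the endpoint of higher (degree, id) rank.
    rank = {v: (len(adj[v]), v) for v in adj}
    out = {v: {u for u in adj[v] if rank[u] > rank[v]} for v in adj}
    triangles = 0
    for w in adj:                # w: lowest-ranked corner of the triangle
        for v in out[w]:         # edge w -> v
            s = out[v]           # the set S of Algorithm 1
            for u in out[w]:     # edge w -> u
                if u in s:       # edge v -> u closes the triangle
                    triangles += 1
    return triangles
```

Each triangle is counted exactly once, because only its lowest-ranked corner has out-edges to both other corners.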


Algorithm 2 Algebraic algorithm for counting triangles
 1: function CountTrianglesFMM(G, $\delta$)
 2:     Construct $\vec{G}$
 3:     triangles := 0
 4:     for v ∈ V(G) do
 5:         S := set of endpoints of outedges of v
 6:         for each inedge wv of v in $\vec{G}$ do
 7:             if $\deg_G(w) \le \delta$ then
 8:                 for each outedge wu of w in $\vec{G}$ do
 9:                     if u ∈ S then
10:                         triangles := triangles + 1
11:     $G_\delta$ := subgraph of G induced on vertices of degree more than $\delta$
12:     return triangles + the number of triangles in $G_\delta$, counted using fast matrix multiplication
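
A sketch of the two-phase structure of Algorithm 2 follows. It is not a fast implementation: the dense phase uses a naive cubic matrix product as a stand-in for fast matrix multiplication, and counts triangles as trace(A^3)/6. The names `matmul` and `count_triangles_split` are illustrative assumptions.

```python
def matmul(a, b):
    """Naive matrix product (placeholder for fast matrix multiplication)."""
    inner = len(b)
    cols = len(b[0]) if b else 0
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def count_triangles_split(adj, delta):
    """Count triangles: combinatorially when the lowest-ranked corner has
    degree at most delta, algebraically inside the high-degree subgraph."""
    rank = {v: (len(adj[v]), v) for v in adj}
    out = {v: {u for u in adj[v] if rank[u] > rank[v]} for v in adj}
    triangles = 0
    for w in adj:
        if len(adj[w]) > delta:
            continue             # handled by the dense phase below
        for v in out[w]:
            for u in out[w]:
                if u in out[v]:
                    triangles += 1
    # Dense phase: subgraph induced on vertices of degree more than delta.
    heavy = [v for v in adj if len(adj[v]) > delta]
    index = {v: i for i, v in enumerate(heavy)}
    h = len(heavy)
    a = [[0] * h for _ in range(h)]
    for v in heavy:
        for u in adj[v]:
            if u in index:
                a[index[v]][index[u]] = 1
    a2 = matmul(a, a)
    trace_a3 = sum(a2[i][j] * a[j][i] for i in range(h) for j in range(h))
    return triangles + trace_a3 // 6
```

The split is consistent because the lowest-ranked corner of a triangle has the smallest degree, so a triangle is missed by the first phase exactly when all three of its corners have degree more than delta.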


Lemma 5.3. Algorithms CountTriangles and CountTrianglesFMM are correct. Their running times
are $O(\sum_{i=1}^{n-1} d_i b(i)^2)$ and $O(\sum_{i=1}^{\delta} d_i b(i)^2 + d_{>\delta}^{\omega})$, respectively.

Proof. Let us first consider the running time of CountTriangles. Observe that the body of the for loop in
line 6 is run exactly once per each edge of $\vec{G}$. Thus, the for loop in line 7 is run at most b(deg(w))
times for a vertex w. In other words, line 8 is executed for each pair of vertices u and v, which are
endpoints of the outedges of w. This requires $O(\sum_{i=1}^{n-1} d_i b(i)^2)$ time.

The set S can be implemented as a Boolean array. This way we can initialize the set each time in linear
time. Moreover, we can test for membership in constant time. Moreover, as observed before, $\vec{G}$ can be
computed in linear time. Thus, CountTriangles runs in $O(\sum_{i=1}^{n-1} d_i b(i)^2)$ time.

Concerning correctness, let $v_1, \ldots, v_n$ be all vertices of G sorted in non-decreasing order of degrees.
Consider a triangle T. Let v be the vertex of T that comes first in the sorted order $v_1, \ldots, v_n$. Then, in $\vec{G}$
the two edges of T that are incident to v are out-edges of v. Thus, the correctness of CountTriangles
follows.

Using similar arguments, we may observe that the first stage of CountTrianglesFMM (lines 4-10)
correctly identifies exactly the triangles that contain at least one vertex of degree at most $\delta$. Clearly, $G_\delta$
contains exactly the triangles that have not been identified yet. Since $G_\delta$ has exactly $d_{>\delta}$ vertices, the
running time of CountTrianglesFMM follows. □


We now combine the algorithms with the bounds on b{k) derived in lemmas 5.1 and 5.2 to obtain four 
running time bounds of our algorithms, that depend on the algorithm used and on whether the graph has 
PLB neighborhoods. These running times are shown in Fig. 

Theorem 5.4. Let G = (V, E), n = |V|, be a PLB graph with parameters $\alpha$ and t. Then, algorithm
CountTriangles can compute the number of triangles in G in: (a) $O(n^{3/\alpha}(t+1)^{3-3/\alpha})$ time for $1 < \alpha < 3$,
(b) $O(n \log n\,(t+1)^2)$ time for $\alpha = 3$, (c) $O(n(t+1)^2)$ time for $\alpha > 3$.


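Proof. By Lemma 5.3 the running time is $O(\sum_{i=1}^{n-1} d_i b(i)^2)$. Let $1 \le \delta < n$ be a parameter that we fix later.
We split the sum into two pieces and first bound $\sum_{i=1}^{\delta} d_i O(b(i)^2)$. By Lemma 3.3 this can be upper bounded
by

$$\sum_{i=1}^{\delta} O(1)\, n(t+1)^{\alpha-1}(i+t)^{-\alpha} i^2.$$

There are now three cases to consider, depending on the value of $\alpha$. For $\alpha > 3$ we have

$$\sum_{i=1}^{\delta} (i+t)^{-\alpha} i^2 \le \sum_{i=1}^{\delta} (i+t)^{2-\alpha} = O((t+1)^{3-\alpha}),$$

so the running time is $O(1)\, n(t+1)^{\alpha-1}(t+1)^{3-\alpha} = O((t+1)^2 n)$, regardless of the choice of $\delta$. Thus, we may
set $\delta = n - 1$.

For $\alpha \le 3$, we have

$$O(1)\, n(t+1)^{\alpha-1} \sum_{i=1}^{\delta} (i+t)^{-\alpha} i^2 \le O(1)\, n(t+1)^{\alpha-1} \sum_{i=1}^{\delta} i^{2-\alpha}.$$

For $\alpha = 3$ this gives $O(n \log(\delta)(t+1)^2)$. Again, we set $\delta = n - 1$ and obtain a running time of
$O(n \log n\,(t+1)^2)$. The last case is when $1 < \alpha < 3$. Then, the sum is equal to $O(n(t+1)^{\alpha-1}\delta^{3-\alpha})$.

In this case we set $\delta < n - 1$, so we still need to bound $\sum_{i=\delta+1}^{n-1} d_i O(b(i)^2)$. We use the fact that
$b(i) = O(d_{\ge i})$ (see Lemma 5.1) and $d_{\ge \delta} = O(n(t+1)^{\alpha-1}\delta^{1-\alpha})$ (see Lemma 3.2):

$$\sum_{i=\delta+1}^{n-1} d_i O(b(i)^2) = \sum_{i=\delta+1}^{n-1} O(d_i d_{\ge i}^2) \le \sum_{i=\delta+1}^{n-1} O(d_i d_{\ge \delta}^2) = O(d_{\ge \delta}^3) = O(n^3(t+1)^{3(\alpha-1)}\delta^{3(1-\alpha)}).$$

The overall running time is $O(n(t+1)^{\alpha-1}\delta^{3-\alpha} + n^3(t+1)^{3(\alpha-1)}\delta^{3(1-\alpha)})$. In order to balance the summands,
we set $\delta = n^{1/\alpha}(t+1)^{1-1/\alpha}$ and obtain the running time of $O(n^{3/\alpha}(t+1)^{3-3/\alpha})$. □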

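Theorem 5.5. Let G be a PLB graph with parameters $\alpha$ and t, where $\alpha < 3$. Then algorithm
CountTrianglesFMM with $\delta = (n(t+1)^{\alpha-1})^{(\omega-1)/(3-\alpha-\omega+\alpha\omega)}$ can compute the number of triangles in G in
$O((n(t+1)^{\alpha-1})^{3.45/(0.45+\alpha)})$ time.

Proof. By Lemma 5.3 the running time is $O(\sum_{i=1}^{\delta} d_i b(i)^2 + d_{>\delta}^{\omega})$. In the proof of Theorem 5.4 we have shown
that for $1 < \alpha < 3$, $\sum_{i=1}^{\delta} d_i b(i)^2 = O(n(t+1)^{\alpha-1}\delta^{3-\alpha})$. By Lemma 3.2, $d_{>\delta} = O(n(t+1)^{\alpha-1}\delta^{1-\alpha})$. The
running time becomes $O(n(t+1)^{\alpha-1}\delta^{3-\alpha} + n^{\omega}(t+1)^{\omega(\alpha-1)}\delta^{\omega(1-\alpha)})$. We balance both summands:

$$\delta^{3-\alpha-\omega+\alpha\omega} = (n(t+1)^{\alpha-1})^{\omega-1}$$

$$\delta = (n(t+1)^{\alpha-1})^{(\omega-1)/(3-\alpha-\omega+\alpha\omega)}$$

and obtain a running time of

$$O(n(t+1)^{\alpha-1}\delta^{3-\alpha}) = O((n(t+1)^{\alpha-1})(n(t+1)^{\alpha-1})^{(3-\alpha)(\omega-1)/(3-\alpha-\omega+\alpha\omega)})$$
$$= O((n(t+1)^{\alpha-1})^{(3-\alpha)(\omega-1)/(3-\alpha-\omega+\alpha\omega)+1})$$
$$= O((n(t+1)^{\alpha-1})^{2\omega/(3-\alpha-\omega+\alpha\omega)}).$$

Setting $\omega = 2.38$, we get

$$O((n(t+1)^{\alpha-1})^{4.76/(0.62+1.38\alpha)}) = O((n(t+1)^{\alpha-1})^{3.45/(0.45+\alpha)}).$$

For $\alpha = 2$, $3.45/(0.45+\alpha) < 1.41$, so the running time becomes $O(n^{1.41}(t+1)^{1.41})$. □

If G additionally has PLB neighborhoods, we may obtain a faster algorithm.
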


Theorem 5.6. Let G be a PLB graph with parameters $\alpha$ and t, where $\alpha < 3$. Moreover, assume that G has
PLB neighborhoods. Then, algorithm CountTriangles can compute the number of triangles in G in time
(a) $O(n^{9/2-3\alpha/2}(t+1)^{3\alpha/2-3/2})$ for $2 < \alpha < 7/3$, (b) $O(n(t+1)^2)$ for $\alpha > 7/3$.


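Proof. By Lemma 5.3 the running time is $O(\sum_{i=1}^{n-1} d_i b(i)^2)$. By Lemma 5.2,
$b(k) = O(\min(n(t+1)^{\alpha-1}k^{1-\alpha}, \log n + (t+1)^{\alpha-2}k^{3-\alpha}))$. In particular, $b(k) = O(\log n + (t+1)^{\alpha-2}k^{3-\alpha})$.

Again, we split the sum using a parameter $\delta$ that we fix later.

We first bound the sum of the first $\delta$ summands (we use Lemma 3.3):

$$\sum_{i=1}^{\delta} d_i b(i)^2 = O(1)\, n(t+1)^{\alpha-1} \sum_{i=1}^{\delta} b(i)^2 (i+t)^{-\alpha}$$
$$= O\Bigl(n(t+1)^{\alpha-1} \sum_{i=1}^{\delta} (\log^2 n + (t+1)^{2\alpha-4} i^{6-2\alpha})(i+t)^{-\alpha}\Bigr)$$
$$= O\Bigl(n(t+1)^{\alpha-1} \Bigl(\sum_{i=1}^{\delta} \log^2 n\,(i+t)^{-\alpha} + (t+1)^{2\alpha-4} \sum_{i=1}^{\delta} i^{6-2\alpha}(i+t)^{-\alpha}\Bigr)\Bigr)$$
$$= O\Bigl(n(t+1)^{3\alpha-5} \Bigl((t+1)^{1-\alpha} \log^2 n + \sum_{i=1}^{\delta} i^{6-2\alpha}(i+t)^{-\alpha}\Bigr)\Bigr)$$
$$= O\Bigl(n \log^2 n\,(t+1)^{2\alpha-4} + n(t+1)^{3\alpha-5} \sum_{i=1}^{\delta} i^{6-2\alpha}(i+t)^{-\alpha}\Bigr).$$

We use Lemma 2.3. If $\alpha > 7/3$, the sum is bounded by $O((t+1)^{7-3\alpha})$. Thus, if we set $\delta = n - 1$, the
running time becomes $O(n(t+1)^2)$.

It remains to consider the case when $2 < \alpha < 7/3$. Then, we assume that $\delta = \Omega(t)$ (which we can do,
since we are free to choose $\delta$). The sum can be bounded by $O(\delta^{7-3\alpha})$, so $\sum_{i=1}^{\delta} d_i b(i)^2 = O(n(t+1)^{3\alpha-5}\delta^{7-3\alpha})$.
To obtain the running time we still need to bound $\sum_{i=\delta+1}^{n-1} d_i b(i)^2$. In the proof of Theorem 5.4 we
have shown that $\sum_{i=\delta+1}^{n-1} d_i b(i)^2 = O(n^3(t+1)^{3(\alpha-1)}\delta^{3(1-\alpha)})$. We balance
$n(t+1)^{3\alpha-5}\delta^{7-3\alpha} + n^3(t+1)^{3(\alpha-1)}\delta^{3(1-\alpha)}$:

$$n(t+1)^{3\alpha-5}\delta^{7-3\alpha} = n^3(t+1)^{3(\alpha-1)}\delta^{3(1-\alpha)}$$
$$\delta^4 = n^2(t+1)^2$$
$$\delta = \sqrt{n(t+1)}$$

The running time becomes $O(n^{9/2-3\alpha/2}(t+1)^{3\alpha/2-3/2})$. □

Thanks to Theorem 3.11, Theorem 5.6 applies (whp) to graphs generated by the erased configuration model.
Thus, this algorithm generalizes and strengthens the result of Berry et al. [7] by showing whp bounds on the
running time instead of bounds in expectation.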

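Theorem 5.7. Let G be a PLB graph with parameters $2 < \alpha < 7/3$ and t. Moreover, assume that G has PLB
neighborhoods. Then algorithm CountTrianglesFMM for $\delta = n^{(\omega-1)/(7-\omega+(\omega-3)\alpha)}(t+1)^{1-2/(7-\omega+(\omega-3)\alpha)}$
can compute the number of triangles in G in $O(n^{(23.04-7.68\alpha)/(7.46-\alpha)}(t+1)^{(7.67(\alpha-1))/(7.46-\alpha)})$ time.


Proof. By Lemma 5.3 the running time is $O(\sum_{i=1}^{\delta} d_i b(i)^2 + d_{>\delta}^{\omega})$. In the proof of Theorem 5.6 we have shown
that for $2 < \alpha < 7/3$ and $\delta = \Omega(t)$, $\sum_{i=1}^{\delta} d_i b(i)^2 = O(n(t+1)^{3\alpha-5}\delta^{7-3\alpha})$. On the other hand, as in the
proof of Theorem 5.5, it takes $O(n^{\omega}(t+1)^{\omega(\alpha-1)}\delta^{\omega(1-\alpha)})$ time to process vertices of degree at least $\delta$. We balance
both times to find the optimal choice for $\delta$.

$$n(t+1)^{3\alpha-5}\delta^{7-3\alpha} = n^{\omega}(t+1)^{\omega(\alpha-1)}\delta^{\omega(1-\alpha)}$$
$$\delta^{7-\omega+(\omega-3)\alpha} = n^{\omega-1}(t+1)^{(\omega-3)\alpha-\omega+5}$$
$$\delta = n^{(\omega-1)/(7-\omega+(\omega-3)\alpha)}(t+1)^{1-2/(7-\omega+(\omega-3)\alpha)}$$

By plugging this back, we obtain the running time of

$$O(n^{\omega(\omega-\omega\alpha-1+\alpha+7-\omega+(\omega-3)\alpha)/(7-\omega+(\omega-3)\alpha)}(t+1)^{(2\omega(\alpha-1))/(7-\omega+(\omega-3)\alpha)})$$
$$= O(n^{(2\omega(3-\alpha))/(7-\omega+(\omega-3)\alpha)}(t+1)^{(2\omega(\alpha-1))/(7-\omega+(\omega-3)\alpha)}).$$

Setting $\omega = 2.38$, we get $O(n^{(14.28-4.76\alpha)/(4.62-0.62\alpha)}(t+1)^{(4.76(\alpha-1))/(4.62-0.62\alpha)}) = O(n^{(23.04-7.68\alpha)/(7.46-\alpha)}(t+1)^{(7.67(\alpha-1))/(7.46-\alpha)})$.
For $\alpha \to 2^{+}$ this becomes $O(n^{1.41}(t+1)^{1.41})$. For $\alpha \to 7/3$, it is $O(n(t+1)^2)$. □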
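
The exponents obtained by balancing in Theorems 5.5 and 5.7 can be evaluated numerically as a sanity check. The helper names below are illustrative, and $\omega = 2.38$ is the value used in the text.

```python
def theorem_5_5_exponent(alpha, omega=2.38):
    """Exponent e with running time (n (t+1)^(alpha-1))^e in Theorem 5.5."""
    return 2 * omega / (3 - alpha - omega + alpha * omega)


def theorem_5_7_exponents(alpha, omega=2.38):
    """Exponents (e_n, e_t) with running time n^e_n (t+1)^e_t in Theorem 5.7."""
    denom = 7 - omega + (omega - 3) * alpha
    return 2 * omega * (3 - alpha) / denom, 2 * omega * (alpha - 1) / denom
```

At $\alpha = 2$ both theorems give an exponent of about 1.41 for n and for t + 1, and at $\alpha = 7/3$ the exponents of Theorem 5.7 evaluate to 1 and 2, which agrees with the limits stated in the proofs.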

5.2 Finding Maximal Clique 

We now show an efficient algorithm for finding the largest clique in a PLB graph.


Algorithm 3 Maximal clique algorithm
 1: function MaximalClique(G)
 2:     Construct $\vec{G}$
 3:     maxclique := 0
 4:     for v ∈ V(G) do
 5:         $N_v$ := {v} ∪ set of endpoints of outedges of v in $\vec{G}$
 6:         for S ⊆ $N_v$ do
 7:             if S is a clique in G then maxclique := max(maxclique, |S|)
 8:     return maxclique
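
Algorithm 3 can be sketched as follows. The sketch assumes a dictionary of neighbor sets with comparable vertex identifiers, and it only enumerates subsets of the out-neighbors of v and then adds v itself, which yields the same maximum as enumerating all subsets of $N_v$. The name `max_clique_size` is illustrative.

```python
from itertools import combinations


def max_clique_size(adj):
    """Size of the largest clique of a simple undirected graph (dict of sets)."""
    rank = {v: (len(adj[v]), v) for v in adj}
    out = {v: [u for u in adj[v] if rank[u] > rank[v]] for v in adj}
    best = 0
    for v in adj:
        # v is adjacent to all of out[v], so S + {v} is a clique iff S is.
        for size in range(len(out[v]) + 1):
            for subset in combinations(out[v], size):
                if all(b in adj[a] for a, b in combinations(subset, 2)):
                    best = max(best, size + 1)
    return best
```

The number of subsets examined for v is exponential only in its out-degree, which is the quantity bounded by b(deg(v)).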


Lemma 5.8. Algorithm MaximalClique is correct, structure-oblivious and runs in $\mathrm{poly}(n) \sum_{i=1}^{n-1} \exp(b(i))$ time.

Proof. Let C be a clique in G. Then, C contains a vertex w such that in $\vec{G}$ w has directed edges to every other
vertex of C. The correctness of the algorithm follows easily. It is also easy to see that it is structure-oblivious.

Consider the iteration of the outer for loop for a vertex v. The size of $N_v$ is bounded by b(deg(v)) + 1,
so the inner for loop runs in $\mathrm{poly}(n) \exp(b(\deg(v)))$ time. For a single v this can be crudely upper bounded by
$\mathrm{poly}(n) \sum_{i=1}^{n-1} \exp(b(i))$. The outer loop has n iterations, so the entire algorithm runs in $\mathrm{poly}(n) \sum_{i=1}^{n-1} \exp(b(i))$ time. □


Theorem 5.9. Let G be a PLB graph with parameters $\alpha$ and t. Then, algorithm MaximalClique can find
the largest clique in G in $\exp(O(n^{1/\alpha}(t+1)^{1-1/\alpha}))$ time.


Proof. By Lemma 5.8 the running time is $\mathrm{poly}(n) \sum_{i=1}^{n-1} \exp(b(i))$. By Lemma 5.1, $b(k) = O(n^{1/\alpha}(t+1)^{1-1/\alpha})$. Thus,

$$n \sum_{i=1}^{n-1} \exp(b(i)) = n \exp(O(n^{1/\alpha}(t+1)^{1-1/\alpha})) = \exp(O(n^{1/\alpha}(t+1)^{1-1/\alpha})).$$

□


This problem can also be solved more efficiently for a PLB graph with parameter $\alpha > 2$ and PLB
neighborhoods.


Theorem 5.10. Let G be a PLB graph with parameters $\alpha > 2$ and t and PLB neighborhoods. Then, algorithm
MaximalClique can find the largest clique in G in (a) $\exp(O((t+1)^{\alpha/2-1/2}n^{3/2-\alpha/2} \log n))$ time for
$2 < \alpha < 3$, (b) $n^{O(t+1)}$ time for $\alpha = 3$, (c) O(poly(n)) time for $\alpha > 3$.


Proof. By Lemma 5.8 the running time is $\mathrm{poly}(n) \sum_{i=1}^{n-1} \exp(b(i))$. Moreover, by Lemma 5.2,
$b(k) = O(\log n\,((t+1)^{\alpha/2-1/2}n^{3/2-\alpha/2} + 1))$. Thus,

$$n \sum_{i=1}^{n-1} \exp(b(i)) = n^2 \exp(O(\log n\,((t+1)^{\alpha/2-1/2}n^{3/2-\alpha/2} + 1))) = \exp(O(\log n\,((t+1)^{\alpha/2-1/2}n^{3/2-\alpha/2} + 1))).$$

If $\alpha < 3$ this can be simplified to $\exp(O((t+1)^{\alpha/2-1/2}n^{3/2-\alpha/2} \log n))$. For $\alpha = 3$, $n^{3/2-\alpha/2} = O(1)$, so we get
$n^{O(t+1)}$. For $\alpha > 3$, we use the fact that $t + 1 = O(n^{\varepsilon})$ for every $\varepsilon > 0$. Thus, $(t+1)^{\alpha/2-1/2}n^{3/2-\alpha/2} = O(1)$,
so the running time becomes $\exp(O(\log n)) = \mathrm{poly}(n)$. □


Observe that for $\alpha > 3$ the running time is polynomial in n. Note that the analysis assumes $t = O(n^{\varepsilon})$ for every $\varepsilon > 0$.


6 Algebraic Algorithms 

In this section we give our algebraic algorithms for computing the matrix determinant and solving linear
systems of equations. As already mentioned, we work over a finite field $\mathbb{F}$. As a warm-up we
start from the generic symmetric case and then move on to the general non-symmetric case. In this section,
when we talk about directed PLB graphs we assume that only the outdegrees of vertices satisfy a bound
similar to the one given in Definition 3.1. Moreover, we use fast rectangular matrix multiplication. We
denote by $\omega(n, m, k)$ the time needed to multiply an n × m matrix by an m × k matrix [39].

6.1 Transitive Closure

Let us start by giving our algorithm for computing the transitive closure of a graph.

Theorem 6.1. Let G be a directed PLB graph with parameters $\alpha$ and t, and let $1 \le k \le n^{1/\alpha}(t+1)^{1-1/\alpha}$.
Then, we can compute the transitive closure of G in $O(n^2(t+1)^{\alpha-1}k^{2-\alpha} + \omega(n, n, n(t+1)^{\alpha-1}k^{1-\alpha}))$
time.

Proof. Let M be the adjacency matrix of G. We start with sorting the rows of M in decreasing order
according to the number of non-zero entries. Then, for a given $k \in [0, n]$, we split the matrix M into 4
submatrices

$$M = \begin{pmatrix} A & B \\ C & D \end{pmatrix},$$

where [A B] contains rows with more than k non-zero entries, [C D] has at most k non-zero entries in
each row and A, D are square matrices. Let $m_{CD}$ be the total number of non-zero entries in submatrices
C, D. By Lemma 3.4, $m_{CD}$ is bounded by $O(n(t+1)^{\alpha-1}k^{2-\alpha})$. Let $n_k$ be the dimension of A. By Lemma 3.2
we have that $n_k$ is bounded by $O(n(t+1)^{\alpha-1}k^{1-\alpha})$.

We can express the transitive closure of M in the following block form

$$\begin{pmatrix} A & B \\ C & D \end{pmatrix}^{*} = \begin{pmatrix} I & 0 \\ -D^{*}C & I \end{pmatrix} \begin{pmatrix} (A - BD^{*}C)^{*} & 0 \\ 0 & D^{*} \end{pmatrix} \begin{pmatrix} I & -BD^{*} \\ 0 & I \end{pmatrix}.$$

In order to compute the transitive closure using this equation we compute:

• $D^{*}$ in $O(n m_{CD})$ time executing n graph searches;

• $D^{*}C$ in $O(n m_{CD})$ time using sparse matrix multiplication;

• $B(D^{*}C)$ in $O(n n_k^{\omega-1})$ time using fast matrix multiplication;

• $BD^{*}$ in $O(n_k^{\omega})$ time using fast matrix multiplication;

• $(A - BD^{*}C)^{*}$ in $O(n_k^{\omega})$ time using fast matrix multiplication;

• both matrix multiplications from the block form above in $O(\omega(n, n_k, n))$ time, as $BD^{*}$ and $D^{*}C$ have one dimension of
size $O(n_k)$.

The theorem follows by plugging the bounds for $m_{CD}$ and $n_k$ into the list above. □



6.2 Determinants of Symmetric Matrices 

Let us start from Lanczos' algorithm, which is useful when dealing with sparse matrices.

Theorem 6.2 ([8], [27]). There is a randomized algorithm which, for a given generic symmetric square
matrix A, computes in O(nm) time det(A) together with matrices Q, T such that $A = QTQ^{T}$, where T is a
tridiagonal matrix, n is the dimension of A and m is the number of non-zero entries in A.

Next, we show an algorithm computing the determinant of a matrix M corresponding to a given PLB
graph G. A symmetric matrix M can be seen as corresponding to the case when G is undirected.


Theorem 6.3. Let G be a directed PLB graph with parameters $\alpha < 2$ and t. Let M be a generic symmetric
matrix, whose non-zero entries are a subset of the non-zero entries of the adjacency matrix of G. Then, we can
compute the determinant of M in

$$O\Bigl((t+1)^{(\alpha-1)\bigl(1+\frac{(\omega-2)(2-\alpha)}{(\omega-2)\alpha+3-\omega}\bigr)}\, n^{2+\frac{(\omega-2)(2-\alpha)}{(\omega-2)\alpha+3-\omega}}\Bigr)$$

time.

Proof. Similarly as in the proof for transitive closure, we start with sorting the rows of M in decreasing
order according to the number of non-zero entries (which is upper bounded by the degree of the corresponding
vertex of G). Then, for a given $k \in [0, n]$, we split the matrix M into 4 submatrices

$$M = \begin{pmatrix} A & B \\ C & D \end{pmatrix},$$

where [A B] contains rows with more than k non-zero entries, [C D] has at most k non-zero entries
in each row and A, D are square matrices. Let $m_{BCD}$ be the total number of non-zero entries in submatrices
B, C, D. As the matrix is symmetric, $m_{BCD}$ is at most twice the number of non-zero entries in [C D].
This in turn, by Lemma 3.4, is at most $O(n(t+1)^{\alpha-1}k^{2-\alpha})$.

By using the formula for the determinant of a Schur complement we obtain

$$\det(M) = \det(D) \det(A - BD^{-1}C).$$

Let $n_k$ be the dimension of A (which depends on k). By Lemma 3.2 we infer that $n_k$ is bounded by
$O(n(t+1)^{\alpha-1}k^{1-\alpha})$, which in turn gives

rife =0(n(f-I-, (4) 


as a > 1. 

By invoking Theorem 6.2 we compute det(D) as well as matrices Q, T such that $D = QTQ^{T}$. The running
time needed is $O(n m_{BCD})$. Denote $Z = A - BD^{-1}C = A - BQTQ^{T}C$. To compute Z efficiently, we first
compute $Q^{T}C$ and BQ in time $O(n m_{BCD})$ each, as both C and B are sparse, i.e., have at most $m_{BCD}$
non-zero entries. As T is tridiagonal, computing $T(Q^{T}C)$ takes time proportional to the size of $Q^{T}C$, that
is $n \cdot n_k$. Then we multiply BQ by $T(Q^{T}C)$ in time $O(n \cdot n_k^{\omega-1})$ by partitioning the matrices into $n/n_k$
submatrices of size $n_k \times n_k$ each and invoking fast matrix multiplication on square matrices. Finally, after
computing Z, we can compute det(Z) in time $O(n_k^{\omega})$.

Summing up, we have to set the value of k to minimize the maximum of four values

• $O(n m_{BCD})$ time used by invoking Theorem 6.2 and for computing the products BQ, $Q^{T}C$,

• $O(n \cdot n_k)$ time for computing the product $T \cdot (Q^{T}C)$,

• $O(n \cdot n_k^{\omega-1})$ time for computing the product $(BQ) \cdot (T(Q^{T}C))$,

• $O(n_k^{\omega})$ time for computing det(Z).

Note that $n \cdot n_k^{\omega-1}$ dominates both $n \cdot n_k$ and $n_k^{\omega}$. Therefore, we need to set the value of k so that
$m_{BCD} = n_k^{\omega-1}$. By using the bound on $m_{BCD}$ and (4) we set
$k = (t+1)^{\frac{(\alpha-1)(\omega-2)}{(\omega-2)\alpha+3-\omega}}\, n^{\frac{\omega-2}{(\omega-2)\alpha+3-\omega}}$, which finishes the proof of the
theorem.

□
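
The Schur complement identity $\det(M) = \det(D)\det(A - BD^{-1}C)$ that the proof relies on can be checked on small examples. The sketch below is only an illustration of the identity: it assumes exact rational arithmetic instead of a finite field, and plain Gaussian elimination instead of Lanczos' algorithm and fast matrix multiplication. The names `det`, `solve` and `schur_det` are hypothetical.

```python
from fractions import Fraction


def det(m):
    """Determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in m]
    n, sign, result = len(a), 1, Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            sign = -sign
        result *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for j in range(c, n):
                a[r][j] -= f * a[c][j]
    return sign * result


def solve(d, c):
    """Return D^{-1} C by Gauss-Jordan elimination (D must be non-singular)."""
    n = len(d)
    a = [[Fraction(x) for x in d[i]] + [Fraction(x) for x in c[i]]
         for i in range(n)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if a[r][col] != 0)
        a[col], a[pivot] = a[pivot], a[col]
        p = a[col][col]
        a[col] = [x / p for x in a[col]]
        for r in range(n):
            if r != col and a[r][col] != 0:
                f = a[r][col]
                a[r] = [x - f * y for x, y in zip(a[r], a[col])]
    return [row[n:] for row in a]


def schur_det(m, s):
    """det(M) as det(D) * det(A - B D^{-1} C); A is the top-left s x s block."""
    a = [row[:s] for row in m[:s]]
    b = [row[s:] for row in m[:s]]
    c = [row[:s] for row in m[s:]]
    d = [row[s:] for row in m[s:]]
    x = solve(d, c)  # D^{-1} C
    z = [[Fraction(a[i][j]) - sum(b[i][k] * x[k][j] for k in range(len(d)))
          for j in range(s)] for i in range(s)]
    return det(d) * det(z)
```

In the algorithm of the proof, A is the small dense block of high-degree rows, so only the $n_k \times n_k$ matrix Z needs a dense determinant computation.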


6.3 Determinant of General Matrices 


In the general case we use the following results due to Eberly [22], who showed how the Frobenius normal
form of a sparse matrix can be computed. The Frobenius normal form $F_A$ of a matrix A is a block diagonal matrix
with companion matrices of monic polynomials $f_1, \ldots, f_k$ on the diagonal, where $f_i$ is divisible by $f_{i+1}$ for
$1 \le i \le k-1$, and $VAV^{-1} = F_A$. The companion matrix of a monic polynomial
$x^d + g_{d-1}x^{d-1} + \ldots + g_1 x + g_0 \in \mathbb{F}[x]$ is a d × d matrix defined as

$$\begin{pmatrix} 0 & \cdots & 0 & -g_0 \\ 1 & \cdots & 0 & -g_1 \\ \vdots & \ddots & \vdots & \vdots \\ 0 & \cdots & 1 & -g_{d-1} \end{pmatrix}.$$

The polynomials $f_1, \ldots, f_k$ are the invariant factors of A and k is the number of invariant factors.

Theorem 6.4 ([22]). There exists an algorithm for computing the Frobenius normal form F of the matrix A
together with the transition matrix V and its inverse with use of O(n) matrix-vector products and $O(kn^2)$
additional operations, where k is the number of invariant factors of A. The algorithm is randomized and
may fail with arbitrarily small probability.
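
The companion matrix defined above can be built directly from the coefficient list. The sketch below uses plain integers for illustration, whereas the paper works over a finite field; the function name `companion_matrix` is an assumption.

```python
def companion_matrix(g):
    """Companion matrix of x^d + g[d-1] x^(d-1) + ... + g[1] x + g[0].

    g lists the non-leading coefficients g_0, ..., g_{d-1}.
    """
    d = len(g)
    c = [[0] * d for _ in range(d)]
    for i in range(1, d):
        c[i][i - 1] = 1       # ones on the subdiagonal
    for i in range(d):
        c[i][d - 1] = -g[i]   # last column holds -g_0, ..., -g_{d-1}
    return c
```

For example, for $x^2 - 5x + 6$ the matrix has trace 5 and determinant 6, matching the sum and the product of the roots 2 and 3.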


We will use the following preconditioning due to [51] to make sure that there is just one invariant factor
with high probability. An n × n Hankel matrix H is constructed from 2n − 1 elements $h_0, \ldots, h_{2n-2}$ in the
following way

$$H = \begin{pmatrix} h_0 & h_1 & \cdots & h_{n-2} & h_{n-1} \\ h_1 & h_2 & \cdots & h_{n-1} & h_n \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ h_{n-1} & h_n & \cdots & h_{2n-3} & h_{2n-2} \end{pmatrix}.$$

We note that multiplication of a matrix by a Hankel matrix H takes $\tilde{O}(n^2)$ time [5]. Similarly, computing
$H^{-1}$ or multiplying a matrix by $H^{-1}$ takes $\tilde{O}(n^2)$ time.
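
The Hankel layout described above, where entry (i, j) depends only on i + j, can be written down in a few lines. The function name `hankel_matrix` is illustrative; in the preconditioning the elements would be drawn uniformly at random from the field, which this sketch leaves to the caller.

```python
def hankel_matrix(h):
    """n x n Hankel matrix built from the 2n - 1 elements h[0], ..., h[2n-2]."""
    if len(h) % 2 == 0:
        raise ValueError("expected an odd number (2n - 1) of elements")
    n = (len(h) + 1) // 2
    return [[h[i + j] for j in range(n)] for i in range(n)]
```

Since the matrix is determined by 2n − 1 values, it can be stored in linear space.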


Theorem 6.5 (Theorem 2 from [34]). Let A be a non-singular square matrix and let H be a Hankel matrix
with elements selected randomly and uniformly from $\mathbb{F}$. Then all leading (or trailing) principal submatrices
of $\tilde{A} = AH$ are non-singular with high probability.

Theorem 6.6 (Equation (1) from [34]). Let A be a matrix such that all its leading principal submatrices
are non-singular and let J be a diagonal matrix with elements selected randomly and uniformly from $\mathbb{F}$. Then
$\tilde{A} = AJ$ has one invariant factor with high probability.


Theorem 6.7. Let G be a directed PLB graph with parameters $\alpha$ and t, and let M be a matrix whose
non-zero entries are a subset of the non-zero entries of the adjacency matrix of G. Moreover, let
$1 \le k \le n^{1/\alpha}(t+1)^{1-1/\alpha}$. Then, one can compute the determinant of M in
$O(n^2(t+1)^{\alpha-1}k^{2-\alpha} + \omega(n, n, n(t+1)^{\alpha-1}k^{1-\alpha}))$ randomized time.


Proof. Similarly as in the symmetric case, we start with sorting the rows of M in decreasing order according
to the number of non-zero entries. Then, for a given $k \in [0, n]$, we split the matrix M into 4 submatrices

$$M = \begin{pmatrix} A & B \\ C & D \end{pmatrix},$$

where [A B] contains rows with more than k non-zero entries, [C D] has at most k non-zero entries
in each row and A, D are square matrices. Let $m_{CD}$ be the total number of non-zero entries in submatrices
C, D. By Lemma 3.4, $m_{CD}$ is bounded by $O(n(t+1)^{\alpha-1}k^{2-\alpha})$.

Let X be an arbitrary n × n matrix. The submatrices of X obtained by performing a similar split as for M
are denoted by

$$X = \begin{pmatrix} X_A & X_B \\ X_C & X_D \end{pmatrix}.$$

Let $H$ and $J$ be random matrices as given in Theorems 6.5 and 6.6. We cannot afford to precondition the whole
matrix, so we precondition only the essential part that is needed for the Schur complement to work:

$$\begin{bmatrix} A & B \\ C & D \end{bmatrix}
\begin{bmatrix} I & H_B \\ 0 & H_D \end{bmatrix}
= \begin{bmatrix} A & AH_B + BH_D \\ C & CH_B + DH_D \end{bmatrix}.$$

Observe that $AH_B + BH_D = (MH)_B$ and $CH_B + DH_D = (MH)_D$. Hence, by Theorem 6.5,
all trailing principal submatrices of $CH_B + DH_D$ are non-singular. Now we apply the second part of the
preconditioning:

$$\tilde{M} = \begin{bmatrix} A & AH_B + BH_D \\ C & CH_B + DH_D \end{bmatrix}
\begin{bmatrix} I & 0 \\ 0 & J_D \end{bmatrix}
= \begin{bmatrix} A & (AH_B + BH_D)J_D \\ C & (CH_B + DH_D)J_D \end{bmatrix}.$$

By Theorem 6.6 the matrix $\tilde{M}_D = (CH_B + DH_D)J_D$ has one invariant factor. By using the formula for the
determinant of a Schur complement we obtain

$$\det(\tilde{M}) = \det(\tilde{M}_D) \det\bigl(A - \tilde{M}_B(\tilde{M}_D)^{-1}C\bigr).$$
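The Schur complement determinant formula used here can be checked on a small example. The sketch below is only an illustration under simplifying assumptions: it works over the rationals with exact `Fraction` arithmetic, it skips the Hankel and diagonal preconditioning, and the function names are hypothetical. It verifies that the determinant of a block matrix equals the determinant of its lower-right block times the determinant of the Schur complement.

```python
from fractions import Fraction


def det(m):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in m]
    n, result = len(a), Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            result = -result
        result *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            a[r] = [x - f * y for x, y in zip(a[r], a[c])]
    return result


def solve(a, b):
    """Solve a x = b exactly by Gauss-Jordan elimination (a must be nonsingular)."""
    n = len(a)
    aug = [[Fraction(x) for x in row] + [Fraction(y)] for row, y in zip(a, b)]
    for c in range(n):
        pivot = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[pivot] = aug[pivot], aug[c]
        pv = aug[c][c]
        aug[c] = [x / pv for x in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [row[n] for row in aug]


def det_via_schur(m, s):
    """det(M) computed as det(D) * det(A - B D^{-1} C), where A is the leading s x s block."""
    n = len(m)
    a = [row[:s] for row in m[:s]]
    b = [row[s:] for row in m[:s]]
    c = [row[:s] for row in m[s:]]
    d = [row[s:] for row in m[s:]]
    # dinv_c[j] is column j of D^{-1} C, obtained by solving D y = C[:, j].
    dinv_c = [solve(d, [row[j] for row in c]) for j in range(s)]
    z = [[a[i][j] - sum(b[i][k] * dinv_c[j][k] for k in range(n - s))
          for j in range(s)] for i in range(s)]
    return det(d) * det(z)
```

In the setting of the proof the block $D$ is replaced by its preconditioned version, so that its determinant and inverse are available through the Frobenius normal form rather than through elimination.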


Let $n_k$ be the dimension of $A$. By Lemma 3.2 we have that $n_k$ is bounded by $O(n(t+1)^{\alpha-1}k^{1-\alpha})$.

By invoking Theorem 6.4 we compute matrices $V$, $F$ such that $\tilde{M}_D = V^{-1}FV$, as well as $\det(\tilde{M}_D) =
\det(F)$, in time $O(nm_{CD})$.

Denote $Z = A - \tilde{M}_B(\tilde{M}_D)^{-1}C = A - \tilde{M}_B V^{-1}F^{-1}VC$. To compute $Z$ efficiently, we first compute
$\tilde{M}_B = (AH_B + BH_D)J_D = (A, B)(H_B, H_D)^{T}J_D$, which requires $\tilde{O}(n^2)$ time. Then computing $\tilde{M}_B V^{-1}$
requires $O(\omega(n, n_k, n))$ time. Next, we compute $VC$ in $O(nm_{CD})$ time because $C$ has at most $m_{CD}$
non-zero entries.

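Due to the special structure of $F$ we know that $F^{-1}$ has $O(n)$ non-zero entries and computing it takes $O(n)$
time using the following form for the inverse of each companion matrix:

$$\begin{bmatrix}
-g_1 & 1 & 0 & \cdots & 0 \\
-g_2 & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
-g_{d-1} & 0 & 0 & \cdots & 1 \\
-1 & 0 & 0 & \cdots & 0
\end{bmatrix}.$$

As $F^{-1}$ has $O(n)$ non-zero entries, computing $F^{-1}VC$ takes time proportional to the size of $VC$, that
is $n \cdot n_k$. To obtain $Z$ we multiply $\tilde{M}_B V^{-1}$ by $F^{-1}VC$ in time $O(n/n_k \cdot n_k^{\omega})$. After computing $Z$, we can
compute $\det(Z)$ in time $O(n_k^{\omega})$. Finally, by our preconditioning $\det(M) = \det(\tilde{M})/\det(H_D J_D)$, where we need $O(n)$
time to compute $\det(H_D J_D)$.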
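The sparsity of the inverse of a companion matrix can be seen on a small instance. The sketch below is a hedged illustration: it assumes the common convention in which the companion matrix of the monic polynomial $x^d + g_{d-1}x^{d-1} + \cdots + g_0$ has ones below the diagonal and $-g_0, \ldots, -g_{d-1}$ in the last column, and it assumes $g_0 \neq 0$. The convention used in the text may differ. Under this convention the first column of the inverse is scaled by $1/g_0$, and the inverse has only $2d - 1$ non-zero entries.

```python
from fractions import Fraction


def companion(g):
    """Companion matrix of x^d + g[d-1] x^(d-1) + ... + g[0] (assumed convention)."""
    d = len(g)
    c = [[Fraction(0)] * d for _ in range(d)]
    for i in range(1, d):
        c[i][i - 1] = Fraction(1)          # ones on the subdiagonal
    for i in range(d):
        c[i][d - 1] = -Fraction(g[i])      # last column holds -g_0, ..., -g_{d-1}
    return c


def companion_inverse(g):
    """Inverse of companion(g), written down directly in O(d) operations (needs g[0] != 0)."""
    d = len(g)
    g0 = Fraction(g[0])
    inv = [[Fraction(0)] * d for _ in range(d)]
    for i in range(d - 1):
        inv[i][0] = -Fraction(g[i + 1]) / g0   # first column: -g_{i+1} / g_0
        inv[i][i + 1] = Fraction(1)            # ones on the superdiagonal
    inv[d - 1][0] = -1 / g0
    return inv


def matmul(x, y):
    """Plain dense matrix product, used only to check the inverse."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]
```

Because the Frobenius normal form is block diagonal with companion blocks, applying this construction to every block yields an inverse with a linear number of non-zero entries.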

Summing up, we have to set the value of $k$ to minimize the maximum of the following:

• $\tilde{O}(n^2)$ time for computing $BHJ$, $DHJ$ and $HJV^{-1}$,

• $O(nm_{CD})$ time used by invoking Theorem 6.4 and for computing the product $VC$,

• $O(\omega(n, n, n_k))$ time for computing $B(HJV^{-1})$,

• $O(nn_k)$ time for computing the product $F^{-1} \cdot VC$,

• $O(nn_k^{\omega-1})$ time for computing the product $BHJV^{-1} \cdot F^{-1}VC$,

• $O(n_k^{\omega})$ time for computing $\det(Z)$,

• $O(n)$ time for computing $\det(HJ)$ and $F^{-1}$.

Note that $O(\omega(n, n, n_k))$ dominates $nn_k^{\omega-1}$, $nn_k$ and $n_k^{\omega}$, whereas $O(nm_{CD})$ dominates $n^2$. Therefore
we need to set the value of $k$ so that $nm_{CD} = \omega(n, n, n_k)$. □


The above theorem gives a general statement that in the parameter range $1 < \alpha < 2$ it is possible to
compute the determinant faster than by using algorithms for general graphs. However, the statement of the
theorem contains a tangled equation, so in order to simplify it we assume that $t = O(\mathrm{polylog}\, n)$. Let $\omega(\beta)$ be
defined such that $n^{\omega(\beta)} = \omega(n, n, n^{\beta})$.


Corollary 6.8. Let $G$ be a directed PLB graph with parameters $\alpha$ and $t$, and $M$ be a matrix whose
non-zero entries are a subset of the non-zero entries of the adjacency matrix of $G$. Let $0 < \beta < 1$ be such that
$2 + \beta(2 - \alpha) = \omega(1 + \beta(1 - \alpha))$, $\beta < 1/\alpha$. Moreover, assume $t = O(\mathrm{polylog}\, n)$. Then, we can compute the
determinant of $M$ in $\tilde{O}(n^{2 + \beta(2 - \alpha)})$ randomized time.
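The exponent equation in Corollary 6.8 can be traced back to the balance condition $nm_{CD} = \omega(n, n, n_k)$ from the proof of Theorem 6.7. The following sketch assumes the bounds $m_{CD} = \tilde{O}(nk^{2-\alpha})$ and $n_k = \tilde{O}(nk^{1-\alpha})$, with the factors depending on $t = O(\mathrm{polylog}\, n)$ absorbed into the $\tilde{O}$ notation, and substitutes $k = n^{\beta}$:

```latex
\begin{align*}
n \, m_{CD} &= \tilde{O}\bigl(n \cdot n k^{2-\alpha}\bigr)
             = \tilde{O}\bigl(n^{2 + \beta(2-\alpha)}\bigr), \\
n_k &= \tilde{O}\bigl(n k^{1-\alpha}\bigr)
     = \tilde{O}\bigl(n^{1 + \beta(1-\alpha)}\bigr), \\
\omega(n, n, n_k) &= \tilde{O}\bigl(n^{\omega(1 + \beta(1-\alpha))}\bigr).
\end{align*}
```

Equating the exponents of the first and the third line gives $2 + \beta(2 - \alpha) = \omega(1 + \beta(1 - \alpha))$, where $\omega(\cdot)$ on the right-hand side is the rectangular matrix multiplication exponent defined just before the corollary.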











We observe that when $M$ is symmetric then $B$ is sparse as in the proof of Theorem 6.3. In such a case
computing $B(HJV^{-1})$ takes $O(nm)$ time instead of $\omega(n, n, n_k)$ time and we obtain similar bounds as in
Theorem 6.3.

Corollary 6.9. It is possible to drop the generic assumption from Theorem 6.3 by increasing the running
time by polylogarithmic factors.


6.4 Linear System Solution and Matrix Inverse 


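In order to solve a linear system with matrix $M$ we extend the idea of the algorithm from the previous
section, i.e., we first run the above algorithm to compute the determinant of $M$ and store all intermediate
results of this computation. Let $v$ be a vector of length $n$. Then to find a vector $x$ such that $Mx = v$ we
compute

$$x = M^{-1}v = \begin{bmatrix} I & H_B \\ 0 & H_D \end{bmatrix}
\begin{bmatrix} I & 0 \\ 0 & J_D \end{bmatrix} \tilde{M}^{-1}v. \tag{5}$$

Now we express the inverse of $\tilde{M}$ in the block form

$$\tilde{M}^{-1} = \begin{bmatrix} I & 0 \\ -(\tilde{M}_D)^{-1}C & I \end{bmatrix}
\begin{bmatrix} Z^{-1} & 0 \\ 0 & (\tilde{M}_D)^{-1} \end{bmatrix}
\begin{bmatrix} I & -\tilde{M}_B(\tilde{M}_D)^{-1} \\ 0 & I \end{bmatrix}.$$

Now we plug in the equation $(\tilde{M}_D)^{-1} = V^{-1}F^{-1}V$ to obtain

$$\tilde{M}^{-1}v = \begin{bmatrix} I & 0 \\ -V^{-1}F^{-1}VC & I \end{bmatrix}
\begin{bmatrix} Z^{-1} & 0 \\ 0 & V^{-1}F^{-1}V \end{bmatrix}
\begin{bmatrix} I & -\tilde{M}_B V^{-1}F^{-1}V \\ 0 & I \end{bmatrix} v. \tag{6}$$

Observe that all matrices in the above have been computed during the computation of the determinant, so
computing $\tilde{M}^{-1}v$ takes $O(n^2)$ time. Then we compute $M^{-1}v$ using (5) in $O(n^2)$ time.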
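The three-factor block form of the inverse translates into a short solve routine. The sketch below is an illustration under simplifying assumptions: exact rational arithmetic, no Hankel or diagonal preconditioning (so it requires the lower-right block $D$ and the Schur complement to be nonsingular), and Gaussian elimination in place of the Frobenius normal form. The function names are hypothetical.

```python
from fractions import Fraction


def solve(a, b):
    """Solve a x = b exactly by Gauss-Jordan elimination (a must be nonsingular)."""
    n = len(a)
    aug = [[Fraction(x) for x in row] + [Fraction(y)] for row, y in zip(a, b)]
    for c in range(n):
        pivot = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[pivot] = aug[pivot], aug[c]
        pv = aug[c][c]
        aug[c] = [x / pv for x in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [row[n] for row in aug]


def block_solve(m, v, s):
    """Solve M x = v through the Schur complement Z = A - B D^{-1} C (A is s x s)."""
    n = len(m)
    a = [row[:s] for row in m[:s]]
    b = [row[s:] for row in m[:s]]
    c = [row[:s] for row in m[s:]]
    d = [row[s:] for row in m[s:]]
    dinv_v2 = solve(d, v[s:])
    # dinv_c[j] is column j of D^{-1} C.
    dinv_c = [solve(d, [row[j] for row in c]) for j in range(s)]
    z = [[a[i][j] - sum(b[i][k] * dinv_c[j][k] for k in range(n - s))
          for j in range(s)] for i in range(s)]
    # Rightmost factor: u1 = v1 - B D^{-1} v2.
    u1 = [v[i] - sum(b[i][k] * dinv_v2[k] for k in range(n - s)) for i in range(s)]
    # Middle factor: x1 = Z^{-1} u1.
    x1 = solve(z, u1)
    # Leftmost factor: x2 = D^{-1} v2 - D^{-1} C x1.
    x2 = [dinv_v2[k] - sum(dinv_c[j][k] * x1[j] for j in range(s))
          for k in range(n - s)]
    return x1 + x2
```

In the algorithm of this section the applications of $D^{-1}$ are replaced by products with $V^{-1}$, $F^{-1}$ and $V$, which were stored during the determinant computation.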


Corollary 6.10. Theorem 6.3, Theorem 6.7 and Corollary 6.8 can be extended to compute a solution to a
linear system at the cost of using $O(n^2)$ additional time.


Finally, we observe that using (6) we can compute the inverse matrix explicitly. In the case of generic
matrices it takes the same time as needed for transitive closure using (6), whereas in the symmetric case the
most expensive multiplication takes $O(nm)$ time instead of $O(\omega(n, n, n_k))$ time, so we obtain the following
corollary.


Corollary 6.11. Theorem 6.3, Theorem 6.7 and Corollary 6.9 can be extended to compute an inverse matrix
in the same asymptotic time.


6.5 Perfect Matching 

As observed by Lovász [41] in 1979, checking whether a graph contains a perfect matching can be done
in $O(n^{\omega})$ time using one determinant computation for an appropriately defined skew-symmetric matrix.
However, an algorithm for finding such a perfect matching was shown only 25 years later [45]. Here, we reuse this
idea in the case of PLB graphs to check whether a graph contains a perfect matching and to find one. The
running time of the resulting algorithms is the same as the running time of determinant computation for
symmetric matrices.

Let us define for a graph $G$ a symbolic skew-symmetric adjacency matrix $\hat{M}$ in the following way

$$\hat{M}_{i,j} = \begin{cases}
z_{i,j} & \text{if } ij \in E \text{ and } i < j, \\
-z_{j,i} & \text{if } ij \in E \text{ and } i > j, \\
0 & \text{otherwise,}
\end{cases}$$

where for each edge $ij \in E$ the variables $z_{i,j}$ are distinct.

Theorem 6.12 ([41]). Let $M$ be a matrix obtained from $\hat{M}$ by substituting uniformly at random elements
from $\mathbb{F}$ for variables. If $G$ has a perfect matching then $\det(M) \neq 0$ with high probability, whereas when $G$
has no perfect matching then $\det(M) = 0$.
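The randomized test behind Theorem 6.12 can be sketched in a few lines. This is an illustration only: it uses plain cubic-time Gaussian elimination rather than the faster determinant algorithms of this section, the field is assumed to be the integers modulo the Mersenne prime $2^{61} - 1$, and the function names are hypothetical. A `False` answer is always correct, whereas a `True` answer is correct with high probability over the random substitution.

```python
import random

P = (1 << 61) - 1  # a Mersenne prime, standing in for the field F


def det_mod(a, p=P):
    """Determinant of an integer matrix modulo the prime p (Gaussian elimination)."""
    a = [[x % p for x in row] for row in a]
    n, det = len(a), 1
    for c in range(n):
        pivot = next((r for r in range(c, n) if a[r][c]), None)
        if pivot is None:
            return 0
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            det = -det
        det = det * a[c][c] % p
        inv = pow(a[c][c], p - 2, p)  # modular inverse by Fermat's little theorem
        for r in range(c + 1, n):
            f = a[r][c] * inv % p
            if f:
                a[r] = [(x - f * y) % p for x, y in zip(a[r], a[c])]
    return det % p


def probably_has_perfect_matching(n, edges, seed=0, p=P):
    """Substitute random non-zero field elements into the skew-symmetric matrix and test det != 0."""
    rng = random.Random(seed)
    m = [[0] * n for _ in range(n)]
    for i, j in edges:
        lo, hi = min(i, j), max(i, j)
        z = rng.randrange(1, p)
        m[lo][hi] = z        # entry z_{i,j} above the diagonal
        m[hi][lo] = p - z    # entry -z_{i,j} below the diagonal
    return det_mod(m, p) != 0
```

For example, the path on four vertices passes the test, while a star with three leaves and a triangle do not.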


We can observe that in our derivation of Corollary 6.9 we have used only the fact that the non-zero structure of
the matrix is symmetric, so the same time bounds hold for the computation of the determinant of a skew-symmetric
matrix $M$. Hence, assuming that $t = O(\mathrm{polylog}\, n)$, to test whether a PLB graph contains a perfect matching
we need $\tilde{O}\bigl(n^{2 + \frac{(\omega-2)(2-\alpha)}{(\omega-2)\alpha+3-\omega}}\bigr)$ randomized time. This is faster than the Micali-Vazirani algorithm [44], which works
in $O(\sqrt{n}\,m)$ time, when $\alpha$ is below a threshold that depends only on $\omega$ and is approximately $1.09042$. However, what is left is to construct
the perfect matching when we know that the graph contains one. In order to do it we need to compute
$(M^{-1})_A = (A - BD^{-1}C)^{-1}$. By our preconditioning we have

$$M^{-1} = \begin{bmatrix} I & H_B \\ 0 & H_D \end{bmatrix}
\begin{bmatrix} I & 0 \\ 0 & J_D \end{bmatrix} \tilde{M}^{-1}.$$

Using this equation and the equality $(\tilde{M}^{-1})_A = Z^{-1}$ we obtain

$$(M^{-1})_A = Z^{-1} - H_B J_D V^{-1}F^{-1}VC\,Z^{-1}.$$

We need $O(nm_{BCD})$ time to compute $VC$, $O(nn_k)$ time for $(H_B(J_D V^{-1}))F^{-1}$ and finally $O(nn_k^{\omega-1})$
time to compute $(H_B J_D V^{-1}F^{-1})(VC)$. These running times do not increase the running times stated in
Theorem 6.3 and Corollary 6.9. Let $G_A$ be the subgraph of $G$ represented by $M_A$, i.e., the subgraph induced
only by vertices of degree at least $k$. Now we can apply the following observation that is the core idea of [32].


Theorem 6.13 (Procedure DeleteEdgesWithin from [32]). Given a Schur complement of $M_A$ (i.e., $Z^{-1}$)
one can find in $O(n_k^{\omega})$ time a set of edges $P_A \subseteq E(G_A)$ such that $P_A$ can be extended to a perfect matching
$P$ of the whole graph $G$.


Hence, we first invoke the above theorem to find $P_A$ and then we are left to find a matching in the graph
$G - E(G_A)$, where all edges between high degree vertices have been removed. This way the degree of a
vertex in $G - E(G_A)$ is bounded by $k$ and so the total number of edges is $O(m_{BCD})$. Using the algorithm
by Micali and Vazirani we need $O(\sqrt{n}\,m_{BCD}) = O(nm_{BCD})$ time to extend $P_A$ to a perfect matching on $G$.
This way we obtain the following theorem.


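Theorem 6.14. Let $G$ be a PLB graph with parameters $\alpha < 2$ and $t$. Then, we can find a perfect matching
in $G$ in $O\Bigl((t+1)^{\alpha-1+\frac{(\omega-2)(\alpha-1)(2-\alpha)}{(\omega-2)\alpha+3-\omega}}\, n^{2+\frac{(\omega-2)(2-\alpha)}{(\omega-2)\alpha+3-\omega}}\Bigr)$ time with high probability.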
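To get a feeling for the bound, the exponent of $n$ can be evaluated numerically. The sketch below assumes $t = O(\mathrm{polylog}\, n)$ and the simplified bounds $m = \tilde{O}(nk^{2-\alpha})$ and $n_k = \tilde{O}(nk^{1-\alpha})$, and balances the cost $nm$ against $nn_k^{\omega-1}$ with $k = n^{\beta}$. The default value `omega=2.373` is an illustrative assumption for the square matrix multiplication exponent, and the function names are hypothetical.

```python
def balancing_beta(alpha, omega=2.373):
    """Exponent beta of k = n^beta that balances n*m against n*n_k^(omega-1)."""
    if not 1 < alpha < 2:
        raise ValueError("the bound is stated for 1 < alpha < 2")
    return (omega - 2) / ((omega - 2) * alpha + 3 - omega)


def matching_exponent(alpha, omega=2.373):
    """Exponent of n in the running time: 2 + beta * (2 - alpha)."""
    return 2 + balancing_beta(alpha, omega) * (2 - alpha)


if __name__ == "__main__":
    for alpha in (1.2, 1.5, 1.8):
        print(alpha, round(matching_exponent(alpha), 4))
```

The exponent tends to $\omega$ as $\alpha$ approaches 1 and to 2 as $\alpha$ approaches 2, so under these assumptions the bound interpolates between dense matrix multiplication time and quadratic time.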


6.6 Complexity of PageRank 

Let us now discuss the arithmetic complexity of exact PageRank computation. (We note that the approximate
iterative methods used in practice have worse theoretical running time bounds.) Computing PageRank is a
simple version of the eigenproblem, where we are asked to find an eigenvector for the eigenvalue which is equal
to 1. Eigenproblems, in comparison with the determinant problem, are usually more challenging, because we
cannot easily use preconditioning, as it can change both eigenvalues and eigenvectors. The complexity of
this simple problem is either $O(n^{\omega})$ using [33] or $O(knm)$ using [22] (where $k$ is the number of invariant factors).
However, we can show that the problem is easier on a directed PLB graph.

We assume that we are given a graph $G$ where out-degrees satisfy a power law with exponent $\alpha$. The
PageRank vector $\pi$ is the eigenvector of the following $n \times n$ matrix

$$M = cP + (1 - c)\tfrac{1}{n}E, \tag{7}$$

where $c$ is a damping factor which can be set between 0 and 1 (typically 0.85), the matrix $P$ is an adjacency
matrix of $G$ defined by the rules $P_{ij} = 0$ if there is no edge from $i$ to $j$ and $P_{ij} = 1/\mathrm{outdeg}(i)$ otherwise,
and $E$ is a matrix whose entries all equal one. In other words, having defined $M$, the PageRank vector is
an eigenvector $\pi$ of $M$ corresponding to eigenvalue $\mu = 1$, i.e., the PageRank vector satisfies the following
equations

$$\pi^T = \pi^T M, \tag{8}$$

$$\pi^T e = 1, \tag{9}$$

where $e$ is a vector of size $n$ whose entries all equal one.
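The defining equations of the PageRank vector can be solved exactly on a toy graph. The sketch below is an illustration under explicit assumptions: every vertex has at least one outgoing edge and the graph has no parallel edges, row $i$ of $P$ is normalized by the out-degree of $i$ so that the matrix is row-stochastic, arithmetic is exact over the rationals, and the system is solved by plain Gaussian elimination rather than by the algorithm of this section. The function names are hypothetical.

```python
from fractions import Fraction


def solve(a, b):
    """Solve a x = b exactly by Gauss-Jordan elimination (a must be nonsingular)."""
    n = len(a)
    aug = [[Fraction(x) for x in row] + [Fraction(y)] for row, y in zip(a, b)]
    for c in range(n):
        pivot = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[pivot] = aug[pivot], aug[c]
        pv = aug[c][c]
        aug[c] = [x / pv for x in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [row[n] for row in aug]


def google_matrix(n, edges, c=Fraction(85, 100)):
    """M = c P + (1 - c)/n E for a directed graph given as (source, target) pairs."""
    outdeg = [0] * n
    for i, _ in edges:
        outdeg[i] += 1
    m = [[(1 - c) / n for _ in range(n)] for _ in range(n)]
    for i, j in edges:
        m[i][j] += c / outdeg[i]
    return m


def pagerank(n, edges, c=Fraction(85, 100)):
    """Solve pi^T (I - M) = 0 together with pi^T e = 1."""
    m = google_matrix(n, edges, c)
    # The first n - 1 rows encode columns of pi^T (I - M) = 0; the last row is the normalization.
    rows = [[Fraction(int(i == j)) - m[j][i] for j in range(n)] for i in range(n - 1)]
    rows.append([Fraction(1)] * n)
    rhs = [Fraction(0)] * (n - 1) + [Fraction(1)]
    return solve(rows, rhs)
```

One equation of the singular system is replaced by the normalization, which makes the system uniquely solvable when the matrix is irreducible.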

As usual we start with sorting the rows of P in decreasing order according to the number of non-zero 
entries, i.e., out-degrees of corresponding vertices. Let us consider the following matrix decomposition 


$$P = \begin{bmatrix} A & B \\ C & D \end{bmatrix},$$

where $[A, B]$ contains rows with more than $k$ non-zero entries for $1 < k < n$. As previously, the size of $A$
is denoted by $n_k$, whereas the number of non-zero entries in $D$ is denoted by $m_D$. We define as well the
corresponding decomposition of $\pi$

$$\pi = (\pi_A, \pi_D).$$

Using this block form we can rewrite the equation (7) as follows

$$M = \begin{bmatrix} M_A & M_B \\ M_C & M_D \end{bmatrix}
= \begin{bmatrix} cA + (1-c)/n\,E_A & cB + (1-c)/n\,E_B \\ cC + (1-c)/n\,E_C & cD + (1-c)/n\,E_D \end{bmatrix},$$

where $E_A$, $E_B$, $E_C$ and $E_D$ are matrices of the appropriate size whose entries all equal one. In turn, the
equation (8) can be rewritten as

$$\pi^T(I - M) = \pi^T \begin{bmatrix} I - M_A & -M_B \\ -M_C & I - M_D \end{bmatrix} = 0.$$


We observe that $I - M$ is an irreducible M-matrix. (A matrix $X$ is said to be an M-matrix when it can be written
as $X = aI - Y$, where all entries of $Y$ are nonnegative and $a$ is greater than or equal to the spectral radius of $Y$.
The spectral radius of $M$ is 1 as it is a stochastic matrix.) This implies by Theorem 4.16 from [6] that every leading
or trailing principal and proper submatrix of $I - M$ is nonsingular. In particular $I - M_D$ is nonsingular.
Let $H_D$ be a random Hankel matrix and let $J_D$ be a random diagonal matrix. Then by the preconditioning
from Section 6.3, $(I - M_D)H_D J_D$ has one invariant factor. We observe that multiplication of the matrix
$I - M_D = I - cD - (1 - c)/n\,E_D$ by a vector takes $O(m_D)$ arithmetic operations, so by Theorem 6.4 we
can compute the Frobenius normal form $F$ and the transition matrix $V$ of $(I - M_D)H_D J_D$ in $O(nm_D)$ arithmetic
operations.

The stochastic complement of $M_A$ in $M$ is the following matrix

$$S_A = M_A + M_B(I - M_D)^{-1}M_C = M_A + M_B H_D J_D V^{-1}F^{-1}VM_C.$$

Computing $S_A$ requires $O(nm_D)$ arithmetic operations to compute $VM_C$, $O(nn_k)$ arithmetic operations to
compute $F^{-1}(VM_C)$ and $M_B H_D J_D$, $O(\omega(n, n, n_k))$ arithmetic operations to compute $(M_B H_D J_D)V^{-1}$, and
finally $O(nn_k^{\omega-1})$ arithmetic operations to compute $(M_B H_D J_D V^{-1})(F^{-1}VM_C)$.
Now using the equations from [43] we obtain

$$\pi_A^T S_A = \pi_A^T,$$

$$\pi_D^T = \pi_A^T M_B(I - M_D)^{-1} = \pi_A^T M_B H_D J_D V^{-1}F^{-1}V,$$

which means that $\pi_A$ is a stationary distribution for the smaller matrix $S_A$ and can be computed in $O(n_k^{\omega})$
arithmetic operations [35]. Then in order to compute $\pi_D$ we need $O(n^2)$ arithmetic operations. We note that
$O(\omega(n, n, n_k))$ dominates $O(nn_k^{\omega-1})$, $O(nn_k)$ and $O(n_k^{\omega})$, whereas $O(nm_D)$ dominates $O(n^2)$. Therefore we
obtain the following theorem.

Theorem 6.15. Let $G$ be a directed PLB graph with parameters $\alpha$ and $t$, and let $1 < k < n^{1/\alpha}(t+1)^{1-1/\alpha}$.
Then, we can compute PageRank of $G$ with $O\bigl(n^2(t+1)^{\alpha-1}k^{2-\alpha} + \omega(n, n, n(t+1)^{\alpha-1}k^{1-\alpha})\bigr)$ arithmetic
operations with high probability.
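The stochastic complement computation can be illustrated on a small stochastic matrix. The sketch below is a simplified instance under explicit assumptions: exact rational arithmetic, Gaussian elimination in place of the Hankel and diagonal preconditioning and the Frobenius normal form, and hypothetical function names. It computes the stationary vector of the leading block through its stochastic complement, extends it to the remaining coordinates, and normalizes the result.

```python
from fractions import Fraction


def solve(a, b):
    """Solve a x = b exactly by Gauss-Jordan elimination (a must be nonsingular)."""
    n = len(a)
    aug = [[Fraction(x) for x in row] + [Fraction(y)] for row, y in zip(a, b)]
    for c in range(n):
        pivot = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[pivot] = aug[pivot], aug[c]
        pv = aug[c][c]
        aug[c] = [x / pv for x in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [row[n] for row in aug]


def stationary(m):
    """Row vector pi with pi m = pi and sum(pi) = 1, for an irreducible stochastic m."""
    n = len(m)
    rows = [[Fraction(int(i == j)) - m[j][i] for j in range(n)] for i in range(n - 1)]
    rows.append([Fraction(1)] * n)
    return solve(rows, [Fraction(0)] * (n - 1) + [Fraction(1)])


def stationary_via_complement(m, s):
    """Stationary vector of m computed through the stochastic complement of its leading s x s block."""
    n = len(m)
    ma = [row[:s] for row in m[:s]]
    mb = [row[s:] for row in m[:s]]
    mc = [row[:s] for row in m[s:]]
    md = [row[s:] for row in m[s:]]
    # Transpose of I - M_D: row r_i of R = M_B (I - M_D)^{-1} solves (I - M_D)^T r_i = (M_B)_i.
    imd_t = [[Fraction(int(i == j)) - md[j][i] for j in range(n - s)]
             for i in range(n - s)]
    r = [solve(imd_t, row) for row in mb]
    # Stochastic complement S_A = M_A + R M_C.
    sa = [[ma[i][j] + sum(r[i][k] * mc[k][j] for k in range(n - s))
           for j in range(s)] for i in range(s)]
    pi_a = stationary(sa)
    # Remaining coordinates: pi_D = pi_A R, then rescale so that all entries sum to one.
    pi_d = [sum(pi_a[i] * r[i][j] for i in range(s)) for j in range(n - s)]
    total = sum(pi_a) + sum(pi_d)
    return [x / total for x in pi_a + pi_d]
```

The final rescaling is needed because the stationary vector of the complement is normalized on the leading block only.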

We note that the above idea can be combined together with iterative methods. In such a case instead
of using the stochastic complement of $M_A$ we shall use the stochastic complement of $M_D$, i.e., $S_D =
M_D + M_C(I - M_A)^{-1}M_B$. However, we explicitly compute only the inverse $(I - M_A)^{-1}$ using Strassen's
fast matrix inverse, but we do not execute the other multiplications and keep $S_D$ in the lazy form as given by this
equation. We can apply iterative methods to compute the stationary distribution of $S_D$ using this lazy form.
We have implemented this approach and on a single computer it can reduce the time needed for PageRank
computation by a factor of two on graphs that have approximately 100000 nodes, e.g., the WikiTalk network.

| Graph | $n$ | $m$ | $c_1$ | $c_2$ | $\alpha$ | $t$ | $A/\sqrt{m}$ |
|---|---|---|---|---|---|---|---|
| Amazon (directed, in-degree + out-degree) | 241761 | 1131217 | 5 | 0.615102 | 3.198 | 22.2994 | 0.3996 |
| AstroPh (directed, in-degree + out-degree) | 17903 | 393944 | 1.2888 | 0.208271 | 2.0189 | 21.0207 | 1.606 |
| Cities (directed, in-degree + out-degree) | 3144 | 34753 | 0.9652 | - | 1.9126 | 1.8661 | 10.6425 |
| CondMatt (undirected) | 21363 | 182572 | 4.7952 | 2.15535 | 5.2849 | 26.1942 | 0.65296 |
| Dblp (undirected) | 718115 | 5573812 | 2.6633 | 5.80862 | 3.4134 | 9.589 | 0.3808 |
| Enron (undirected) | 33696 | 361622 | 1.2549 | 0.610801 | 2.2674 | 3.4682 | 2.2998 |
| Epinions (directed, in-degree + out-degree) | 32223 | 443506 | 1.2166 | - | 1.8863 | 3.8008 | 4.1894 |
| EuAll (directed, in-degree + out-degree) | 34203 | 151132 | 3.1966 | 1.37106 | 2.4201 | 3.756 | 3.8635 |
| Facebook (undirected) | 59691 | 1456818 | 0.8077 | - | 1.6668 | 6.2728 | 0.8409 |
| HepPh (directed, in-degree + out-degree) | 12711 | 139965 | 5 | 3.00723 | 5.2231 | 70.1391 | 1.0104 |
| LiveJournal (directed, in-degree + out-degree) | 3828682 | 65349587 | 2.1985 | 4.64595 | 2.5893 | 18.8438 | 2.8287 |
| NotreDame (directed, in-degree + out-degree) | 53968 | 296228 | 2.2113 | 1.55224 | 2.6274 | 9.5051 | 14.0243 |
| Slashdot (directed, in-degree + out-degree) | 71307 | 841201 | 1.4678 | 0.166008 | 2.0236 | 3.8451 | 5.5191 |
| WikiTalk (directed, in-degree + out-degree) | 111881 | 1477893 | 1.5124 | 0.177209 | 2.031 | 3.7847 | 6.6613 |
| WIW (undirected) | 29406 | 393797 | 0.5474 | - | 1.2562 | 0 | 1.0135 |
| YouTube (undirected) | 495957 | 3873496 | 1.0395 | 0.66258 | 2.2474 | 1.8672 | 12.9103 |
| AstroPh (directed, out-degree) | 17903 | 393944 | 3.1737 | 4.6716 | 3.5199 | 32.734 | - |
| Epinions (directed, out-degree) | 32223 | 443506 | 2.0569 | 1.21022 | 2.4379 | 6.3751 | - |
| EuAll (directed, out-degree) | 34203 | 151132 | 2.4122 | 0.401895 | 2.1407 | 0 | - |
| HepPh (directed, out-degree) | 12711 | 139965 | 4.3101 | 1.10021 | 4.7202 | 25.2953 | - |
| LiveJournal (directed, out-degree) | 3828682 | 65349587 | 2.2261 | 8.05048 | 2.7745 | 12.0126 | - |
| NotreDame (directed, out-degree) | 53968 | 296228 | 4.9269 | 1.65396 | 2.6162 | 0.5484 | - |
| Slashdot (directed, out-degree) | 71307 | 841201 | 1.5542 | 0.376638 | 2.165 | 3.3024 | - |
| WikiTalk (directed, out-degree) | 111881 | 1477893 | 1.1869 | - | 1.9364 | 0.9833 | - |
| Amazon (directed, in-degree + out-degree) | 241761 | 1131217 | 5 | - | 1.8072 | 0 | 0.3996 |
| CondMatt (undirected) | 21363 | 182572 | 5 | 0.420346 | 2.1699 | 0 | 0.65296 |

Table 1: Adjustment of PLB universal constants.

Figure 1: Real-World networks are PLB: definition adjustment

Figure 2: Real-World networks are PLB: definition adjustment for t=0



[Figure 3 plot; horizontal axis: alpha]

Figure 3: The exponent of the running time of our algorithms for counting triangles. Here PLBN stands for
PLB neighborhoods. #edges is the number of edges in a graph, and folklore is #edges multiplied by 3/2, as
the well-known algorithm for counting triangles runs in $O(m^{3/2})$ time.

Figure 4: The exponent of the running times of our algebraic algorithms for power law graphs and matrices
whose nonzero entries correspond to the edges of a power law graph. Symmetric shows the complexity of
the determinant algorithm for symmetric matrices as well as the perfect matching algorithm. General depicts the
complexity of algorithms for determinant, PageRank, matrix inverse, linear system solving and transitive
closure in matrices that do not need to be symmetric. The complexities are derived using the bound on
$\omega(n, n, n_k)$ given in [89]. Our results are compared to the running times of algorithms that work for arbitrary
graphs and matrices. Note that the bound of $O(\sqrt{n}\,m)$ is only known for the perfect matching algorithm.